Liquid is an open-source template language created by Shopify in 2006 and written in Ruby. It is used by several frameworks, Jekyll being one of the best known.
This website is built with Jekyll, specifically with my Jekyll template Chulapa.
Some time ago, @cargocultprogramming opened dieghernan/chulapa#29 because one of the components of the theme was broken in Jekyll >= 4.1.0.
Digging a bit, I found jekyll/jekyll#8214, which exposes the same issue. What seemed to be a feature was in fact a bug that some developers were exploiting.
The change is that applying the group_by Liquid filter to an array used to produce a “grouped” version of the array, while on Jekyll >= 4.1.0 it produces a different result that can’t be used in the same way.
{% assign alldocs = site.exercises %}
{% assign grouptag = alldocs | map: 'tags' | join: ',' | split: ',' | group_by: tag %}
{{ grouptag }}
<!-- Jekyll < 4.1.0 result -->
{"name"=>"tag A", "items"=>["tag A"], "size"=>1}{"name"=>"Tag B", "items"=>["Tag B"], "size"=>1}{"name"=>"Virtualbox", "items"=>["Virtualbox"], "size"=>1}{"name"=>"netcat", "items"=>["netcat"], "size"=>1}{"name"=>"whois", "items"=>["whois"], "size"=>1}{"name"=>"dig", "items"=>["dig"], "size"=>1} ... {"name"=>"Hydra", "items"=>["Hydra"], "size"=>1}
<!-- Jekyll >= 4.1.0 result -->
{"name"=>"", "items"=>["tag A", "Tag B", "Virtualbox", "netcat", "whois", "dig", ... , "Hydra"], "size"=>26}
So basically, counting items was not easy anymore. I developed a solution in pure Liquid (which happens to be quite verbose once you step outside the predefined filters) that is compatible with any Jekyll version.
The algorithm is now implemented in Chulapa. You can check the results on my /tags page.
Note that the tables produced in the examples are taken from my live site, so they may change as I add more posts. The tables should match the order and counts of the tags displayed on the /tags page.
Alternative group_by with Liquid
First, we define an array of all the tags included in the documents of my site:
{% assign alldocs = site.documents %}
{% assign alltags = alldocs | map: 'tags' | join: ',' | split: ',' %}
Cool! Now we can extract the unique tags in alltags and count how many times each one occurs in the array:
<!-- Allocate the arrays for the group_by replacement -->
<!-- Unique values -->
{% assign single_tags = alltags | uniq %}
<!-- Arrays to populate -->
{% assign count_tags = '' | split: ',' %}
<!-- Iterator 0 to number of unique tags - 1 (size = number of unique tags) -->
{% assign n_tags = single_tags | size | minus: 1 %}
{% for i in (0..n_tags) %}
<!-- Populate -->
{% assign count_this_tag = alltags | where_exp:"item", "item == single_tags[i]" | size %}
{% assign count_tags = count_tags | push: count_this_tag %}
{% endfor %}
<!-- Display single_tags and count_tags as a table -->
<table>
<caption>Display count of tags on this site </caption>
<tr>
<th>Tag</th>
<th>Count</th>
</tr>
{% for i in (0..n_tags) %}
<tr>
<td>{{ single_tags[i] }}</td>
<td>{{ count_tags[i] }}</td>
</tr>
{% endfor %}
</table>
See results
Tag | Count |
---|---|
r_bloggers | 20 |
rstats | 21 |
rspatial | 21 |
sf | 12 |
maps | 25 |
vignette | 1 |
rnaturalearth | 3 |
function | 4 |
leaflet | 5 |
jekyll | 2 |
html | 2 |
beautiful_maps | 7 |
giscoR | 9 |
raster | 1 |
flags | 4 |
mapSpain | 3 |
Wikipedia | 1 |
cartography | 4 |
svg | 1 |
inset | 3 |
r_package | 5 |
classInt | 1 |
terra | 6 |
rasterpic | 1 |
ggplot2 | 8 |
tmap | 1 |
mapsf | 1 |
discontinued | 8 |
project | 11 |
R | 5 |
python | 2 |
guest-author | 1 |
COVID19 | 3 |
ggridges | 1 |
tidyterra | 4 |
maptiles | 1 |
s2 | 1 |
astronomy | 2 |
celestial | 2 |
geojson | 2 |
gpkg | 2 |
resmush | 1 |
liquid | 1 |
chulapa | 1 |
pebble | 4 |
watchface | 4 |
javascript | 4 |
C | 4 |
webscrapping | 1 |
dataset | 2 |
csv | 2 |
json | 1 |
(empty) | 1 |
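When only one tag's count is needed, the same where_exp filter can be applied directly, without building the whole count_tags array. This is a minimal sketch, assuming alltags is defined as above; the names wanted and wanted_count and the tag value are only illustrative:

```liquid
<!-- Hypothetical example: count the occurrences of a single tag -->
{% assign wanted = 'jekyll' %}
{% assign wanted_count = alltags | where_exp: "item", "item == wanted" | size %}
<p>{{ wanted }}: {{ wanted_count }}</p>
```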
Sorting
How do we rank the tags by number of occurrences? We can take the maximum count and loop from that value down to 1. On each pass of the outer loop, every tag whose count equals the current value is appended to the ranked arrays:
<!-- Used in https://github.com/mmistakes/minimal-mistakes/blob/master/_includes/posts-taxonomy.html -->
{% assign items_max = count_tags | sort | last %}
{% assign sorted_tags = '' | split: ',' %}
{% assign sorted_count_tags = '' | split: ',' %}
{% for i in (1..items_max) reversed %}
{% for j in (0..n_tags) %}
{% if count_tags[j] == i %}
{% assign sorted_tags = sorted_tags | push: single_tags[j] %}
{% assign sorted_count_tags = sorted_count_tags | push: i %}
{% endif %}
{% endfor %}
{% endfor %}
{% assign sorted_tags = sorted_tags | uniq %}
<table>
<caption>Display sorted count of tags on this site </caption>
<tr>
<th>Tag</th>
<th>Count (desc sorted)</th>
</tr>
{%- for i in (0..n_tags) %}
<tr>
<td>{{ sorted_tags[i] }}</td>
<td>{{ sorted_count_tags[i] }}</td>
</tr>
{%- endfor -%}
</table>
See results
Tag | Count (desc sorted) |
---|---|
maps | 25 |
rstats | 21 |
rspatial | 21 |
r_bloggers | 20 |
sf | 12 |
project | 11 |
giscoR | 9 |
ggplot2 | 8 |
discontinued | 8 |
beautiful_maps | 7 |
terra | 6 |
leaflet | 5 |
r_package | 5 |
R | 5 |
function | 4 |
flags | 4 |
cartography | 4 |
tidyterra | 4 |
pebble | 4 |
watchface | 4 |
javascript | 4 |
C | 4 |
rnaturalearth | 3 |
mapSpain | 3 |
inset | 3 |
COVID19 | 3 |
jekyll | 2 |
html | 2 |
python | 2 |
astronomy | 2 |
celestial | 2 |
geojson | 2 |
gpkg | 2 |
dataset | 2 |
csv | 2 |
vignette | 1 |
raster | 1 |
Wikipedia | 1 |
svg | 1 |
classInt | 1 |
rasterpic | 1 |
tmap | 1 |
mapsf | 1 |
guest-author | 1 |
ggridges | 1 |
maptiles | 1 |
s2 | 1 |
resmush | 1 |
liquid | 1 |
chulapa | 1 |
webscrapping | 1 |
json | 1 |
(empty) | 1 |
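Because the two ranked arrays are aligned by position, showing only the most used tags should be a matter of limiting the loop. A sketch, assuming sorted_tags and sorted_count_tags were built as above; top_n and idx are hypothetical names introduced for the example:

```liquid
<!-- Hypothetical example: list only the five most used tags -->
{% assign top_n = 5 %}
<ul>
{% for tag in sorted_tags limit: top_n %}
  {% assign idx = forloop.index0 %}
  <li>{{ tag }} ({{ sorted_count_tags[idx] }})</li>
{% endfor %}
</ul>
```

Tags with the same count keep the order in which they first appear in single_tags, so the cut-off between tied tags follows that order.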
Bottom line
Done! Here is a clean version of the algorithm:
{% assign alldocs = site.documents %}
{% assign alltags = alldocs | map: 'tags' | join: ',' | split: ',' %}
{% assign single_tags = alltags | uniq %}
<!-- Counting -->
{% assign count_tags = '' | split: ',' %}
{% assign n_tags = single_tags | size | minus: 1 %}
{% for i in (0..n_tags) %}
{% assign count_this_tag = alltags | where_exp:"item", "item == single_tags[i]" | size %}
{% assign count_tags = count_tags | push: count_this_tag %}
{% endfor %}
<!-- Extra: sort -->
{% assign items_max = count_tags | sort | last %}
{% assign sorted_tags = '' | split: ',' %}
{% assign sorted_count_tags = '' | split: ',' %}
{% for i in (1..items_max) reversed %}
{% for j in (0..n_tags) %}
{% if count_tags[j] == i %}
{% assign sorted_tags = sorted_tags | push: single_tags[j] %}
{% assign sorted_count_tags = sorted_count_tags | push: i %}
{% endif %}
{% endfor %}
{% endfor %}
{% assign sorted_tags = sorted_tags | uniq %}
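One possible way to reuse the snippet is to save it as an include. This is only a sketch: the file name tag_counts.html is hypothetical, and it relies on Jekyll keeping variables assigned inside an include available to the calling page.

```liquid
<!-- Assumes the algorithm above is saved as _includes/tag_counts.html (hypothetical name) -->
{% include tag_counts.html %}
<p>Most used tag: {{ sorted_tags | first }} ({{ sorted_count_tags | first }})</p>
```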