
Clojure collections and their basic functions
There are four collection types in Clojure:
- Lists
- Vectors
- Maps
- Sets
In this recipe, we will describe what these types are and some basic functions for them.
Getting ready
You only need REPL described in the recipe Repl up! in Chapter 1, Live Programming with Clojure, and no additional libraries. Start REPL so that you can review the sample code in this recipe.
How to do it...
We will learn collection types in Clojure including lists, vectors, maps, and sets. We will learn how to create them and use basic functions for them.
Lists
Lists are commonly used in Lisp. Clojure also supports the list data type. Lists are internally implemented as a linked list. To create a list, begin with quote ('
) and then enclose elements with ()
. If you want to create an empty list, use '
()
, or (list)
:
'("A Study in Scarlet" "The Sign of the Four" "The Hound of the Baskervilles" "The Valley of Fear") ;;=> ("A Study in Scarlet" "The Sign of the Four" "The Hound of the Baskervilles" "The Valley of Fear") '() ;;=> () (list) ;;=> ()
There are three basic functions, car
, cdr
, and cons
, to manipulate lists in traditional Lisp. Instead of them, Clojure uses first
, rest
, and conj
. Let's see how they work:
first
takes the head element from a list:(first '(1 2 3 4 5)) ;;=> 1 (first '()) ;;=> nil (first nil) ;;=> nil
- The
first
of an empty list ornil
returnsnil
. rest
returns a list after the first element:(rest '(1 2 3 4 5)) ;;=> (2 3 4 5) (rest '()) ;;=> () (rest nil) ;;=> ()
- The
rest
of an empty list ornil
returns an empty list. conj
adds elements at the beginning:(conj '(2 3 4 5) 1) ;;=> (1 2 3 4 5)
conj
is a variadic function and can take multiple elements as its arguments:```clj (conj '("d" "e" "f") "c" "b" "a") ;;=> ("a" "b" "c" "d" "e" "f")
Vectors
A vector also represents a sequence of elements, like list, but it supports efficient access to its elements by index.
To create a vector, enclose its elements with []
. To create an empty vector, use []
or (vector)
:
["A Study in Scarlet" "The Sign of the Four" "The Hound of the Baskervilles" "The Valley of Fear"] ;;=> ["A Study in Scarlet" "The Sign of the Four" "The Hound of the Baskervilles" "The Valley of Fear"] [] ;;=> [] (vector) ;;=> []
We can use first
, rest
, and conj
with vectors. Let's begin with first
and rest
:
(first [1 2 3 4 5]) ;;=> 1 (first []) ;;=> nil (rest [1 2 3 4 5]) ;;=> (2 3 4 5)
The rest
of the vector looks like a returning list, but it is not list. It is ChunkedSeq
:
(class (rest [1 2 3 4 5])) ;;=> clojure.lang.PersistentVector$ChunkedSeq
To add elements to a vector, use the conj
function. conj
appends elements to the vector:
(conj [1 2 3 4] 5) ;;=> [1 2 3 4 5]
Maps
Maps are associative lists, and they can be accessed efficiently with keys and return their values.
To create a map, enclose key-value pairs with {}
:
{:name "Arthur Ignatius Conan Doyle" :born "22-May-1859" :died "7-July-1930" :occupation ["novelist" "short story writer" "poet" "physician"] :nationality "scotish" :citizenship "United Kingdom" :genre ["Detective fiction", "fantasy", "science fiction", "historical novels", "non-fiction"] :notable-works ["Stories of Sherlock Holmes" "The Lost World"] :spouse ["Louisa Hawkins" "Jean Leckie"] :no-of-children 5 } ;;=> {:genre ["Detective fiction" "fantasy" "science fiction" "historical novels" "non-fiction"], :occupation ["novelist" "short story writer" "poet" "physician"], :name "Arthur Ignatius Conan Doyle", :no-of-children 5, :nationality "scotish", :died "7-July-1930", :spouse ["Louisa Hawkins" "Jean Leckie"], :notable-works ["Stories of Sherlock Holmes" "The Lost World"], :citizenship "United Kingdom", :born "22-May-1859"}
To create an empty map, use {}
or (hash-map)
:
{} ;;=> {} (hash-map) ;;=> {}
We can use first
, rest
, and conj
for maps, just as for lists or vectors:
(first {:a 1, :b 2, :c 3 :d 4 :e 5}) ;;=> [:a 1] (rest {:a 1, :b 2, :c 3 :d 4 :e 5}) ;;=> ([:b 2] [:c 3] [:d 4] [:e 5]) (conj {:a 1, :b 2, :c 3 :d 4 } [:e 5]) ;;=> {:a 1, :b 2, :c 3, :d 4, :e 5} (conj {:a 1, :b 2, :c 3 :d 4 } [:d 5] [:e 6]) ;;=> {:a 1, :b 2, :c 3, :d 5, :e 6} (conj {:a 1, :b 2, :c 3 :d 4 } [:d 5]) ;;=> {:a 1, :b 2, :c 3, :d 5}
Maps are often used as dictionaries. get
, dissoc
, and assoc
are more commonly used with maps.
get
looks up a map entry by its key and returns its value. If get
cannot find the key in a collection, it returns nil
.
If you don't want to obtain nil
in such a case, specify the value you want when the key does not exist:
(get {:a 1, :b 2, :c 3 :d 4 :e 5} :c) ;;=> 3 (get {:a 1, :b 2, :c 3 :d 4 :e 5} :f) ;;=> nil (get {:a 1, :b 2, :c 3 :d 4 :e 5} :f :not-found) ;;=> :not-found
assoc
adds key-value pairs to maps. If there is already a key-value pair in the map, it replaces the value:
(assoc {:a 1, :b 2, :c 3 :d 4} :e 5) ;;=> {:a 1, :b 2, :c 3, :d 4, :e 5} (assoc {:a 1, :b 2, :c 3 :d 4} :c 5) ;;=> {:a 1, :b 2, :c 5, :d 4}
assoc
can use multiple key-value pairs as its arguments:
(assoc {:a 1, :b 2, :c 3 :d 4} :e 5 :f 6) ;;=> {:a 1, :b 2, :c 3, :d 4, :e 5, :f 6}
The dissoc
removes key-value pairs from a map:
(dissoc {:a 1, :b 2, :c 3 :d 4} :c) ;;=> {:a 1, :b 2, :d 4} (dissoc {:a 1, :b 2, :c 3 :d 4} :c) ;;=> {:a 1, :b 2, :d 4} (dissoc {:a 1, :b 2, :c 3 :d 4} :e) ;;=> {:a 1, :b 2, :c 3, :d 4}
Sets
Sets maintain the uniqueness of their elements. To create a set, being with a hash (#
) and then enclose elements with {}
. If you want to create an empty set, use #{}
or (hash-set)
:
#{"apple" "orange" "banana" "peach" "strawberry"} ;;=> #{"peach" "apple" "banana" "orange" "strawberry"} #{} ;;=> #{} (hash-set)
We can use first
, rest
, and conj
for sets. Sets do not keep the order of elements as you input data, so first
or rest
may look different to what you expected:
(first #{:a :b :c :d :e}) ;;=> :e (rest #{:a :b :c :d :e}) ;;=> (:c :b :d :a) (conj #{:a :b :c :d} :e) ;;=> #{:e :c :b :d :a} (conj #{:a :b :c :d :e} :e) ;;=> #{:e :c :b :d :a}
The contains?
function examines whether the element exists in a set, whereas disj
eliminates the element from the set:
(contains? #{:a :b :c :d :e} :c) ;=> true (contains? #{:a :b :c :d :e} :f) ;;=> false (disj #{:a :b :c :d :e} :c) ;;=> #{:e :b :d :a} (disj #{:a :b :c :d :e} :c :d) ;;=> #{:e :b :a}
How it works...
All collection types in Clojure are persistent and immutable. A persistent data structure preserves the previous versions of itself when it is modified. Clojure has efficient implementation to create modified versions of collections so it doesn't simply copy the entire data structure, but shares the original data.
Clojure collections
All collection types implement the clojure.lang.Seqable
interface and its seq
method returns clojure.lang.ISeq
. ISeq
has the cons
, first
, more
, and next
functions:
public interface Seqable { ISeq seq(); } public interface ISeq extends IPersistentCollection { Object first(); ISeq next(); ISeq more(); ISeq cons(Object o); }
ISeq
implements IPersistentCollection
and it has count
, cons
, empty
, and equiv
:
public interface IPersistentCollection extends Seqable { int count(); IPersistentCollection cons(Object o); IPersistentCollection empty(); boolean equiv(Object o); }
Differences between lists and vectors
Lists and sets represent sequences of elements, but there are some differences:
- Lists are
clojure.lang.PersistentList
and vectors areclojure.lang.PersistentVector
. - Lists are implemented as a linked list. Vectors are implemented using a tree structure, and each node has 32 elements.
- Vectors are efficient for index access. Index access for lists takes
O(n)
, whereas it takesO(log32(n))
for vectors. - That is why, with
conj
, the order of the results is different in lists and vectors.conj
puts an element at the beginning of lists, but it puts an element at the end of vectors:
(conj '(2 3 4 5) 1) ;;=> (1 2 3 4 5) (conj [1 2 3 4] 5) ;;=> [1 2 3 4 5]
The following code tests index access for a list and a vector. This test measures access to the last element in a collection consisting of 1 million elements. The test loops 100 times. Access to the list takes half a second, but access to the vector only takes 0.1 milliseconds:
(let [n 1000000 iter 100 l (into () (repeat n "a")) v (into [] (repeat n "a")) ] (time (dotimes [_ iter](nth l (dec n) ))) (time (dotimes [_ iter](nth v (dec n) ))) ) "Elapsed time: 532.623841 msecs" "Elapsed time: 0.167739 msecs"
Reversing the order of vectors is faster than reversing the order of lists:
(do (time (reverse (into '() (range 100000)))) (time (reverse (into [] (range 100000)))) nil ) "Elapsed time: 60.67435 msecs" "Elapsed time: 26.431954 msecs" nil
The reason is that clojure.lang.PersistentVector
implements clojure.lang.Reversible
but clojure.lang.PersistentList
does not.
Clojure is immutable
Since Clojure functions for collections are immutable, it does not modify the original collection. It is different from Python or Ruby. In the following Python code, insert modifies the original list:
l = [2, 3, 4, 5] l.insert(0, 1) l #[1, 2, 3, 4, 5]
The next Ruby code also modifies the original list:
$ irb irb(main):001:0> letters = Array.new([2,3,4]) => [2, 3, 4] irb(main):002:0> letters.insert(0, 1) => [1, 2, 3, 4]
But Clojure does not change the original data:
(def l '(2 3 4 5)) #'collection.core/l (conj '(2 3 4 5) 1) ;;=> (1 2 3 4 5) l ;;=> (2 3 4 5)
Using immutable data ensures that arguments are unmodified after returning from other function calls. On the other hand, calling functions causes unexpected modification of arguments, which causes bugs that are difficult to find.
There's more...
Here, we will show you some techniques for manipulating collections.
Converting collections between different types
into
converts types in collections. The following code converts between a vector and a list:
(into [] '(1 2 3 4 5)) ;;=> [1 2 3 4 5] (into '() [1 2 3 4 5]) ;;=> (5 4 3 2 1)
In the following code, into
converts between a map and a vector:
(into {} [[:a 1][:b 2] [:c 3][:d 4][:e 5]]) ;;=> {:a 1, :b 2, :c 3, :d 4, :e 5} (into [] {:a 1 :b 2 :c 3 :d 4 :e 5}) ;;=> [[:a 1] [:b 2] [:c 3] [:d 4] [:e 5]]
Finally, this into
converts between a set and a vector:
(into [] #{:a :b :c :d :e}) ;;=> [:e :c :b :d :a] (into #{} [:a :b :c :d :e]) #{:e :c :b :d :a}
Using distinct
distinct
is similar to into
with sets. It eliminates duplicated elements:
(distinct ["apple" "orange" "melon" "grape" "orange" "pinapple"]) ;;=> ("apple" "orange" "melon" "grape" "pinapple") (into #{} ["apple" "orange" "melon" "grape" "orange" "pinapple"]) ;;=> #{"apple" "pinapple" "orange" "grape" "melon"}
Unlike set
, using distinct
does not change the data type.