Reference Types, or “Why some variable values change without me touching them…”

Background

In Ruby, if a new array variable is created by assigning an existing array to it, and the new array is then altered, the original array will change too. This may seem counter-intuitive if you're new to programming, until you understand that arrays are reference types by nature.

Before I go into a bit more detail, let me illustrate this with some code, first with strings and then with arrays:

Strings

string1 = "hello"
string2 = string1
string1.object_id
# 70297680246440
string2.object_id
# 70297680246440
string2 = "world"
# "world"
string2.object_id
# 70297676448900

Above, we created a variable called string1 and gave it the value “hello”. We then created a new variable called string2 and assigned it the value of string1. If we inspect the object ids of both variables, we can see that they are the same. This means that both variables point to the same object in memory.

After string2 is assigned a new value, its object id changes. That's because a new String object was allocated to hold “world”, and string2 now points to it. string1 still points to “hello”.

Arrays

arr1 = [1,2,3]
arr2 = arr1
arr1.object_id
# 70192684545360
arr2.object_id
# 70192684545360
arr2.pop
# 3
arr2
# [1, 2]
arr1
# [1, 2]  ???

As with the Strings example, we created an array called arr1 and gave it the value [1,2,3]. We then created a new variable called arr2 and assigned it the value of arr1. Once again, after inspecting the object ids of both arrays, we can see that both variables point to the same object in memory.

Now here comes the interesting part.

After popping the last element off arr2, you might expect Ruby to allocate a new space in memory for arr2 and only perform the alteration on that one variable. But, no. As shown above, the value of arr1 has also been altered.

Why? Because arr2 was never reassigned. pop changed the one shared array in place. Compare that with an actual reassignment, e.g.

arr2 = arr1.pop  # Reassignment: arr2 now refers to the popped element, not to the array
arr1.object_id == arr2.object_id  # Check whether both object ids are the same
# false
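
If what you actually want is a second array that can change without affecting the first, one option is to copy it explicitly. The following is only a minimal sketch: pop_from_copy is a hypothetical helper name made up for this illustration, and dup makes a shallow copy, so any nested objects inside the array would still be shared.

```ruby
# Hypothetical helper, named for this illustration only.
def pop_from_copy(original)
  copy = original.dup   # shallow copy: a new Array object holding the same elements
  copy.pop              # mutates only the copy
  copy
end

arr1 = [1, 2, 3]
arr2 = pop_from_copy(arr1)

p arr1                  # [1, 2, 3]
p arr2                  # [1, 2]
p arr1.equal?(arr2)     # false (equal? compares object identity)
```

Here arr1 keeps all three elements because the pop happened on a different object.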

I lied. Look closer at Strings

I'm not comparing apples with apples in the examples above. I've done this on purpose, to show a common but incorrect thought process while writing code. At least, it's a mistake that I have made a couple of times…

The Strings example is different because I reassigned the value of the second variable.

A correct comparison would be something like this:

string1 = "hello"
# "hello"
string2 = string1
# "hello"
string2.upcase!
# "HELLO"
string1
# "HELLO"
string1.object_id == string2.object_id
# true

Now you can see how the value of string1 was altered, even though it was never touched directly.
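
The same contrast shows up between a method that changes its receiver in place (upcase!) and one that returns a new object (upcase). Here is a rough sketch of that difference. shout and shout! are hypothetical helper names used only for this illustration, and String.new is there just to make sure the string is mutable.

```ruby
# Hypothetical helpers, named for this illustration only.
def shout(text)
  text.upcase    # non-bang: returns a new String, the receiver is untouched
end

def shout!(text)
  text.upcase!   # bang: modifies the receiver in place
  text
end

string1 = String.new("hello")   # a mutable String
string2 = string1               # both variables point to the same object

string3 = shout(string2)
p string1                       # "hello"
p string3                       # "HELLO"
p string3.equal?(string1)       # false: string3 is a different object

shout!(string2)
p string1                       # "HELLO"
p string2.equal?(string1)       # true: still one shared object
```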

Apples with apples…

Reference Types

Think of a Reference Type as a type whose variables hold merely a pointer/address to a location in memory that contains the real value. The real values typically live in the Heap (a section of memory).

Also, remember that in Ruby everything is an object.

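Does that mean that all objects are Reference Types? The short answer is yes, although there are some edge-cases: in the standard Ruby implementation, values such as small integers, symbols, nil, true and false are stored directly rather than behind a pointer. Since those values cannot be modified in place anyway, you will rarely notice the difference.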
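
One way to poke at those edge-cases is to ask whether two separately written values turn out to be the very same object. This is just a small sketch, and same_object? is a hypothetical helper name introduced for the illustration. It wraps equal?, which compares object identity rather than contents.

```ruby
# Hypothetical helper: are a and b the very same object?
def same_object?(a, b)
  a.equal?(b)
end

p same_object?(1, 1)                              # true: a small Integer is always the same object
p same_object?(:ruby, :ruby)                      # true: a Symbol is unique per name
p same_object?(nil, nil)                          # true: there is only one nil
p same_object?(String.new("a"), String.new("a"))  # false: two distinct String objects
p same_object?([1], [1])                          # false: two distinct Array objects
```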

Summary

When a new variable is created by assigning an existing variable to it, the value is not copied. Only the reference to the existing variable's value is copied, so both variables point to the same object.

Merely altering a variable's value in place will not cause Ruby to allocate a new section of memory for that variable (and copy the value into it). You need to assign a new value to the variable to make it point somewhere else.

I hope this made at least some sense…

🥔