Variables and variable assignment

A variable is a placeholder for a value and is identified by a valid name and a type. Variable types are those that PostgreSQL allows within a table definition, including user-defined types. A variable can either hold a single value or multiple values of the same type in an array. If it holds a single value, such as an integer, it is called a scalar.

A variable can also handle more complex types, such as tuples, which represent table rows or query results. These variables are called records. There are two specific types of record variable: record and rowtype. The record type is a general abstract type that can handle any specific tuple structure, while rowtype can handle well-defined tuple structures and is therefore tied to a specific result set structure. In other words, a record variable can interchangeably hold a tuple coming from the files table or the tags table, while a rowtype variable must be associated with a particular table. As they are complex structures, both the record and rowtype variables must be de-referenced to obtain a single value (such as a column value). This is done using a . symbol followed by the name of the field to be accessed. More details about these two types of variables will be discussed later.

We can assign values to a variable in two ways:

  • The := assignment operator (or its equivalent shorter form, =)
  • The INTO predicate in conjunction with an SQL statement that returns tuples

The := assignment operator works for an inline single value assignment of either a scalar or complex type. The INTO predicate, on the other hand, can be used to perform multiple assignments in a single statement, such as assigning columns of a row to separate scalar variables. If required, the assigned value can be coerced through a cast to the type of the variable it is assigned to.

The INTO predicate is mainly used in conjunction with the SELECT statement, allowing for the assignment of multiple values or the assignment of values extracted from a query, such as a row. We can also use the INTO predicate with any statement that returns a tuple. These statements are INSERT, UPDATE, or DELETE with the RETURNING clause.

As shown in Listing 3, variables are first assigned one at a time via the := operator. Then, a single SELECT INTO statement assigns different values at the i and t variables at the same time:

testdb=> DO $code$
DECLARE
i int; -- initial value is NULL
t text := 'Hello World!';
d date := CURRENT_DATE;
BEGIN
RAISE INFO 'Variables are: d = [%], t = [%], i = [%]', d, t, i;
-- single assignment
i := length( t );
d := now(); -- coerced to 'date'
-- multiple assignment!
SELECT 'Happy new ' || EXTRACT( year FROM d ) + 1 || ' year!', i * i
INTO t, i;

RAISE INFO 'Variables now are: d = [%], t = [%], i = [%]', d, t, i;
END $code$;

INFO: Variables are: d = [2018-07-12], t = [Hello World!], i = [<NULL>]
INFO: Variables now are: d = [2018-07-12], t = [Happy new 2019 year!], i = [144]
Listing 3:  Variable assignments

Under the hood, the := operator implies the execution of a SELECT statement to assign the value. Listing 4 shows how to use the INTO predicate with both an INSERT statement and a SELECT statement:

testdb=> DO $code$
DECLARE
file_name text := 'TestFile.txt';
file_size numeric := 123.56;
file_pk int;
file_ts timestamp;
BEGIN
-- insert a new file and get back the key
INSERT INTO files( f_name, f_size )
VALUES ( file_name, file_size )
RETURNING pk
INTO file_pk;

RAISE INFO 'File [%] has been inserted with pk = %', file_name, file_pk;

-- query back to get the ts column
SELECT ts INTO file_ts
FROM files WHERE pk = file_pk;
RAISE INFO 'Timestamp is %', file_ts;
END $code$;

OUTPUT
File [TestFile.txt] has been inserted with pk = 27
Timestamp is 2018-07-12 10:56:17.600594
Listing 4:  Using the INTO predicate

What happens if there is more than one row that is subject to assignment? This depends on the statement that performs the assignment: in the case of an INSERT, UPDATE, or DELETE statement with a RETURNING clause, an error is thrown. This means that only one tuple at a time can be inserted or modified while being targeted at a variable assignment. In the case of a SELECT statement, it is possible to get more than one row as a result, but only the first tuple is considered as part of the assignment and all subsequent tuples are discarded.

As an alternative, we can use the INTO STRICT optional form, which imposes a single row returned from a SELECT statement and therefore ensures that exactly one row value is assigned, resulting in an error if more than one row is returned by the query. The code snippet of Listing 5 works, but throws away all rows except the first one:

testdb=> DO $code$
DECLARE
file_name text;
file_size numeric;
BEGIN
SELECT f_name, f_size
INTO file_name, file_size
FROM files WHERE f_type = 'png'
ORDER BY f_name;

RAISE INFO 'The first png file is %', file_name;
END $code$;

INFO: The first png file is picture1.png
Listing 5:  Discarding all the rows except the first one

The code of Listing 6, however, does not work because STRICT throws an exception since the query returns more than one row: 

testdb=> DO $code$
DECLARE
file_name text;
file_size numeric;
BEGIN
SELECT f_name, f_size
INTO STRICT file_name, file_size
FROM files WHERE f_type = 'png'
ORDER BY f_name;

RAISE INFO 'The first png file is %', file_name;
END $code$;

ERROR: query returned more than one row
Listing 6: INTO STRICT producing an exception

In order to make Listing 6 work with STRICT, the query must be changed to explicitly return a single result. We can do this by means of a LIMIT predicate, as shown in Listing 7:

testdb=> DO $code$
DECLARE
file_name text;
file_size numeric;
BEGIN
SELECT f_name, f_size
INTO STRICT file_name, file_size
FROM files WHERE f_type = 'png'
ORDER BY f_name
LIMIT 1; -- to make STRICT happy!

RAISE INFO 'The first png file is %', file_name;
END $code$;

INFO: The first png file is picture1.png
Listing 7: INTO STRICT with a good query

Therefore, the STRICT optional predicate is a way to let the code executor check for a single result row. We can use it to be sure that the query is not discarding results and therefore assigning values that are possibly wrong to variables.

PL/pgSQL interpolates variables only where a value is expected. We cannot use a variable value to reference an identifier, such as the name of a column. As a consequence, DDL and utility statements (such as ALTER TABLE) cannot be executed with variable interpolation. There are a few ways to build dynamic statements where variables interpolate as identifiers, which we will discuss later in this section. With regard to ordinary DML statements, Listing 8 works as expected, because file_type is interpolated in the query:

testdb=> DO $code$
DECLARE
file_name text;
file_size numeric;
size_kb int := 1024;
size_mb numeric := size_kb * size_kb;
size_gb numeric := size_mb * size_mb;
unit text;
BEGIN
-- get the max file size
SELECT max( f_size )
INTO file_size
FROM files;

IF file_size > size_kb THEN
file_size := file_size / size_kb;
unit := 'kB';
ELSIF file_size > size_mb THEN
file_size := file_size / size_mb;
unit := 'MB';
ELSIF file_size > size_gb THEN
file_size := file_size / size_gb;
unit := 'GB';
ELSE
unit := 'bytes';
END IF;
RAISE INFO 'Biggest file size is % %', file_size, unit;
END $code$;

OUTPUT
Biggest file size is 520.1000 bytes
Listing 8:  Variable interpolation example

A problem that can arise when working with variable interpolation in SQL statements is name ambiguity. What happens if the file_type variable that is interpolated in Listing 8 is renamed f_type (which is also the name of the table column)? The interpreter aborts the execution because it does not know which f_type item the code is referring to, as shown in Listing 9:

testdb=> DO $code$ 
DECLARE
f_type text := 'png';
file_size text;
file_name text;
BEGIN
SELECT f_size, f_name
INTO STRICT file_size, file_name
FROM files
WHERE f_type = f_type -- f_type = 'png'
ORDER BY f_size DESC
LIMIT 1;
RAISE INFO 'Biggest % file is %', f_type, file_name;
END $code$;

ERROR: column reference "f_type" is ambiguous

LINE 3:    WHERE  f_type = f_type  -- What is what?
                 ^
DETAIL:  It could refer to either a PL/pgSQL variable or a table column.
Listing 9:  Variable and identifier name clashing

The interpreter cannot understand if the right-hand f_type refers to the variable or to the column itself, so it aborts the execution. While simply renaming the variable would solve the problem, and, in most cases, is the best thing to do, we can also use the variable_conflict pragma, which can be set to the following:

  • error: This is the default behavior. It aborts the execution, as shown in Listing 9.
  • use_variable: This forces the use of the variable as a value.
  • use_column: This forces the use of the column as a value.

It is therefore possible to fix the code snippet in Listing 9, transforming it into Listing 10 and forcing the interpreter to use the right-hand f_type as a variable to interpolate:

testdb=> DO $code$ #variable_conflict use_variable
DECLARE
f_type text := 'png';
file_size text;
file_name text;
BEGIN
SELECT f_size, f_name
INTO STRICT file_size, file_name
FROM files
WHERE f_type = f_type -- f_type = 'png'
ORDER BY f_size DESC
LIMIT 1;
RAISE INFO 'Biggest % file is %', f_type, file_name;
END $code$;

OUTPUT
Biggest .png file is picture2.png.
Listing 10:  Variable and identifier name clashing: the use_variable p ragma

We can also make a variable immutable, so that its value cannot change, using the CONSTANT keyword immediately before the variable type at the time of declaration. A constant variable must be assigned an initial value, or a NULL immutable value will be assigned (but this case does not have many practical uses).

To demonstrate how to use CONSTANT, Listing 11 shows an incorrect snippet of code: since database_name and page_size are declared immutable, the executor will not allow any of the subsequent re-assignments to those variables:

testdb=> DO $code$
DECLARE
page_size CONSTANT int := 8;
database_name CONSTANT text := 'PostgreSQL';
BEGIN
RAISE INFO '% page size is % kilobytes by default', database_name, page_size;

page_size := page_size * 2;
RAISE INFO 'but you can change to % kB at compile time', page_size;
database_name := 'MySQL';
RAISE INFO 'or switch to % if you are unhappy!', database_name;
END $code$;

ERROR: variable "page_size" is declared CONSTANT
LINE 11: page_size := page_size * 2;

ERROR: variable "database_name" is declared CONSTANT
LINE 13: database_name := 'MySQL';
Listing 11:  Constant variables

A variable can be also aliased, which means that it can have different names at once. An alias is defined with the ALIAS FOR clause, indicating the new name and the target value. Consider Listing 12, where a is aliased by b and everything operates on either of the two results:

testdb=> DO $code$
DECLARE
a int := 10;
b ALIAS FOR a;
BEGIN
RAISE INFO 'a is %, b is %', a ,b;
-- will impact 'a'
b := b * 2;
RAISE INFO 'a is %, b is %', a ,b;
END $code$;

OUTPUT
a is 10, b is 10
a is 20, b is 20
Listing 12:  Variable alias

As you can see, creating an alias to a variable simply defines another name that the same variable can be accessed with.