- PostgreSQL 11 Server Side Programming Quick Start Guide
- Luca Ferrari
- 1963字
- 2025-02-26 20:39:18
Variables and variable assignment
A variable is a placeholder for a value and is identified by a valid name and a type. Variable types are those that PostgreSQL allows within a table definition, including user-defined types. A variable can either hold a single value or multiple values of the same type in an array. If it holds a single value, such as an integer, it is called a scalar.
A variable can also handle more complex types, such as tuples, which represent table rows or query results. These variables are called records. There are two specific types of record variable: record and rowtype. The record type is a general abstract type that can handle any specific tuple structure, while rowtype can handle well-defined tuple structures and is therefore tied to a specific result set structure. In other words, a record variable can interchangeably hold a tuple coming from the files table or the tags table, while a rowtype variable must be associated with a particular table. As they are complex structures, both the record and rowtype variables must be de-referenced to obtain a single value (such as a column value). This is done using a . symbol followed by the name of the field to be accessed. More details about these two types of variables will be discussed later.
We can assign values to a variable in two ways:
- The := assignment operator (or its equivalent shorter form, =)
- The INTO predicate in conjunction with an SQL statement that returns tuples
The := assignment operator works for an inline single value assignment of either a scalar or complex type. The INTO predicate, on the other hand, can be used to perform multiple assignments in a single statement, such as assigning columns of a row to separate scalar variables. If required, the assigned value can be coerced through a cast to the type of the variable it is assigned to.
The INTO predicate is mainly used in conjunction with the SELECT statement, allowing for the assignment of multiple values or the assignment of values extracted from a query, such as a row. We can also use the INTO predicate with any statement that returns a tuple. These statements are INSERT, UPDATE, or DELETE with the RETURNING clause.
As shown in Listing 3, variables are first assigned one at a time via the := operator. Then, a single SELECT INTO statement assigns different values at the i and t variables at the same time:
testdb=> DO $code$
DECLARE
i int; -- initial value is NULL
t text := 'Hello World!';
d date := CURRENT_DATE;
BEGIN
RAISE INFO 'Variables are: d = [%], t = [%], i = [%]', d, t, i;
-- single assignment
i := length( t );
d := now(); -- coerced to 'date'
-- multiple assignment!
SELECT 'Happy new ' || EXTRACT( year FROM d ) + 1 || ' year!', i * i
INTO t, i;
RAISE INFO 'Variables now are: d = [%], t = [%], i = [%]', d, t, i;
END $code$;
INFO: Variables are: d = [2018-07-12], t = [Hello World!], i = [<NULL>]
INFO: Variables now are: d = [2018-07-12], t = [Happy new 2019 year!], i = [144]
Under the hood, the := operator implies the execution of a SELECT statement to assign the value. Listing 4 shows how to use the INTO predicate with both an INSERT statement and a SELECT statement:
testdb=> DO $code$
DECLARE
file_name text := 'TestFile.txt';
file_size numeric := 123.56;
file_pk int;
file_ts timestamp;
BEGIN
-- insert a new file and get back the key
INSERT INTO files( f_name, f_size )
VALUES ( file_name, file_size )
RETURNING pk
INTO file_pk;
RAISE INFO 'File [%] has been inserted with pk = %', file_name, file_pk;
-- query back to get the ts column
SELECT ts INTO file_ts
FROM files WHERE pk = file_pk;
RAISE INFO 'Timestamp is %', file_ts;
END $code$;
OUTPUT
File [TestFile.txt] has been inserted with pk = 27
Timestamp is 2018-07-12 10:56:17.600594
What happens if there is more than one row that is subject to assignment? This depends on the statement that performs the assignment: in the case of an INSERT, UPDATE, or DELETE statement with a RETURNING clause, an error is thrown. This means that only one tuple at a time can be inserted or modified while being targeted at a variable assignment. In the case of a SELECT statement, it is possible to get more than one row as a result, but only the first tuple is considered as part of the assignment and all subsequent tuples are discarded.
As an alternative, we can use the INTO STRICT optional form, which imposes a single row returned from a SELECT statement and therefore ensures that exactly one row value is assigned, resulting in an error if more than one row is returned by the query. The code snippet of Listing 5 works, but throws away all rows except the first one:
testdb=> DO $code$
DECLARE
file_name text;
file_size numeric;
BEGIN
SELECT f_name, f_size
INTO file_name, file_size
FROM files WHERE f_type = 'png'
ORDER BY f_name;
RAISE INFO 'The first png file is %', file_name;
END $code$;
INFO: The first png file is picture1.png
The code of Listing 6, however, does not work because STRICT throws an exception since the query returns more than one row:
testdb=> DO $code$
DECLARE
file_name text;
file_size numeric;
BEGIN
SELECT f_name, f_size
INTO STRICT file_name, file_size
FROM files WHERE f_type = 'png'
ORDER BY f_name;
RAISE INFO 'The first png file is %', file_name;
END $code$;
ERROR: query returned more than one row
In order to make Listing 6 work with STRICT, the query must be changed to explicitly return a single result. We can do this by means of a LIMIT predicate, as shown in Listing 7:
testdb=> DO $code$
DECLARE
file_name text;
file_size numeric;
BEGIN
SELECT f_name, f_size
INTO STRICT file_name, file_size
FROM files WHERE f_type = 'png'
ORDER BY f_name
LIMIT 1; -- to make STRICT happy!
RAISE INFO 'The first png file is %', file_name;
END $code$;
INFO: The first png file is picture1.png
Therefore, the STRICT optional predicate is a way to let the code executor check for a single result row. We can use it to be sure that the query is not discarding results and therefore assigning values that are possibly wrong to variables.
PL/pgSQL interpolates variables only where a value is expected. We cannot use a variable value to reference an identifier, such as the name of a column. As a consequence, DDL and utility statements (such as ALTER TABLE) cannot be executed with variable interpolation. There are a few ways to build dynamic statements where variables interpolate as identifiers, which we will discuss later in this section. With regard to ordinary DML statements, Listing 8 works as expected, because file_type is interpolated in the query:
testdb=> DO $code$
DECLARE
file_name text;
file_size numeric;
size_kb int := 1024;
size_mb numeric := size_kb * size_kb;
size_gb numeric := size_mb * size_mb;
unit text;
BEGIN
-- get the max file size
SELECT max( f_size )
INTO file_size
FROM files;
IF file_size > size_kb THEN
file_size := file_size / size_kb;
unit := 'kB';
ELSIF file_size > size_mb THEN
file_size := file_size / size_mb;
unit := 'MB';
ELSIF file_size > size_gb THEN
file_size := file_size / size_gb;
unit := 'GB';
ELSE
unit := 'bytes';
END IF;
RAISE INFO 'Biggest file size is % %', file_size, unit;
END $code$;
OUTPUT
Biggest file size is 520.1000 bytes
A problem that can arise when working with variable interpolation in SQL statements is name ambiguity. What happens if the file_type variable that is interpolated in Listing 8 is renamed f_type (which is also the name of the table column)? The interpreter aborts the execution because it does not know which f_type item the code is referring to, as shown in Listing 9:
testdb=> DO $code$
DECLARE
f_type text := 'png';
file_size text;
file_name text;
BEGIN
SELECT f_size, f_name
INTO STRICT file_size, file_name
FROM files
WHERE f_type = f_type -- f_type = 'png'
ORDER BY f_size DESC
LIMIT 1;
RAISE INFO 'Biggest % file is %', f_type, file_name;
END $code$;
ERROR: column reference "f_type" is ambiguous
LINE 3: WHERE f_type = f_type -- What is what? ^ DETAIL: It could refer to either a PL/pgSQL variable or a table column.
The interpreter cannot understand if the right-hand f_type refers to the variable or to the column itself, so it aborts the execution. While simply renaming the variable would solve the problem, and, in most cases, is the best thing to do, we can also use the variable_conflict pragma, which can be set to the following:
- error: This is the default behavior. It aborts the execution, as shown in Listing 9.
- use_variable: This forces the use of the variable as a value.
- use_column: This forces the use of the column as a value.
It is therefore possible to fix the code snippet in Listing 9, transforming it into Listing 10 and forcing the interpreter to use the right-hand f_type as a variable to interpolate:
testdb=> DO $code$ #variable_conflict use_variable
DECLARE
f_type text := 'png';
file_size text;
file_name text;
BEGIN
SELECT f_size, f_name
INTO STRICT file_size, file_name
FROM files
WHERE f_type = f_type -- f_type = 'png'
ORDER BY f_size DESC
LIMIT 1;
RAISE INFO 'Biggest % file is %', f_type, file_name;
END $code$;
OUTPUT
Biggest .png file is picture2.png.
We can also make a variable immutable, so that its value cannot change, using the CONSTANT keyword immediately before the variable type at the time of declaration. A constant variable must be assigned an initial value, or a NULL immutable value will be assigned (but this case does not have many practical uses).
To demonstrate how to use CONSTANT, Listing 11 shows an incorrect snippet of code: since database_name and page_size are declared immutable, the executor will not allow any of the subsequent re-assignments to those variables:
testdb=> DO $code$
DECLARE
page_size CONSTANT int := 8;
database_name CONSTANT text := 'PostgreSQL';
BEGIN
RAISE INFO '% page size is % kilobytes by default', database_name, page_size;
page_size := page_size * 2;
RAISE INFO 'but you can change to % kB at compile time', page_size;
database_name := 'MySQL';
RAISE INFO 'or switch to % if you are unhappy!', database_name;
END $code$;
ERROR: variable "page_size" is declared CONSTANT
LINE 11: page_size := page_size * 2;
ERROR: variable "database_name" is declared CONSTANT
LINE 13: database_name := 'MySQL';
A variable can be also aliased, which means that it can have different names at once. An alias is defined with the ALIAS FOR clause, indicating the new name and the target value. Consider Listing 12, where a is aliased by b and everything operates on either of the two results:
testdb=> DO $code$
DECLARE
a int := 10;
b ALIAS FOR a;
BEGIN
RAISE INFO 'a is %, b is %', a ,b;
-- will impact 'a'
b := b * 2;
RAISE INFO 'a is %, b is %', a ,b;
END $code$;
OUTPUT
a is 10, b is 10
a is 20, b is 20
As you can see, creating an alias to a variable simply defines another name that the same variable can be accessed with.