- PostgreSQL 11 Server Side Programming Quick Start Guide
- Luca Ferrari
- 953字
- 2025-02-26 20:39:23
Immutability
What happens when a function is called over and over again with the same arguments? It is executed over and over again, which is quite easy to understand. PostgreSQL allows a function to declare a level of immutability so that the query executor can make some assumptions to improve the speed of the execution.
To better understand this concept, consider the f_human_file_size() function shown in Listing 46. The function accepts a numeric argument and produces a text representation of that argument:
testdb=> SELECT f_human_file_size( 1234567890 );
f_human_file_size
-------------------
1.15MB
Reading the function implementation, it is clear that such a function will provide the same result for the same input every time and it would produce a huge speed boost if the query executor understood that. It could cache the result for a specified input and then avoid the execution of the function for that input. Although, for a function like f_human_file_size() the performance improvement is small, for other much more complex functions it might be an important gain.
How can the executor decide whether to avoid a function execution under the same input-value circumstances? It does so by considering a property of the function itself. A function can be declared to be either VOLATILE, STABLE, or IMMUTABLE. The default immutability property, when none is specified, is VOLATILE. This means the function must be executed every time, even if the input values are always the same. Since no immutability property has been specified in the definition of Listing 46, the function is treated as a VOLATILE one. On the contrary, if the function never changes its output value when the same input is provided and does not interact with the underlying database (to select or update existing tuples), the function can be declared IMMUTABLE. This is the case for f_human_file_size().
Between VOLATILE and IMMUTABLE, there is an intermediate level named STABLE. A STABLE function is not supposed to modify the underlying database, but nor does it provide the same output for the same input within the same SQL statement every time. In other words, the output of a STABLE function can change between different SQL statements or transactions, even if the same input is provided.
We can choose the right level by following the following rules:
- If the function either modifies the underlying database or returns different output values for the same input values, it must be marked as VOLATILE.
- If the function does not change its output value for the same input values, it can be either a STABLE or IMMUTABLE one:
- If the output depends only on the input arguments and therefore the function does not interact at all with the underlying database, it can be marked as IMMUTABLE.
- If the output depends on the database status (such as a lookup table) and does not change within a single SQL statement, the function can be marked as STABLE.
Be aware that marking a function with the wrong level can lead to incorrect execution and unexpected results, so be careful when dealing with the previously mentioned levels.
In the IMMUTABLE family of functions, we have mathematical functions such as sin() that provide a result depending on the input value only. The STABLE family of functions include the timestamp set of functions (such as now() and current_timestamp); their output does not depend on any input value but does not change within a transaction boundary, meaning it can change across SQL statements outside of a transaction boundary. The VOLATILE family of functions include random() and timeofday(), the output of which does not depend on the underlying database status or any input argument, but changes each time the function is invoked.
We can improve the f_human_file_size() function by specifying that it can be considered IMMUTABLE, as shown in Listing 57.
testdb=> CREATE OR REPLACE FUNCTION
f_human_file_size( f_size numeric )
RETURNS text AS $code$
DECLARE
size_kb CONSTANT int := 1024 * 1024;
size_mb CONSTANT bigint := size_kb * 1024;
size_gb CONSTANT bigint := size_mb * 1024;
unit text := 'bytes';
BEGIN
IF f_size > size_gb THEN
f_size := f_size / size_gb;
unit := 'MB';
ELSEIF f_size > size_mb THEN
f_size := f_size / size_mb;
unit := 'MB';
ELSEIF f_size > size_kb THEN
f_size := f_size / size_kb;
unit := 'KB';
ELSE
unit := 'bytes';
END IF;
RETURN round( f_size, 2 ) || unit;
END $code$ LANGUAGE plpgsql IMMUTABLE;
On the other hand, the f_files_from_directory() function defined in Listing 54 cannot be declared either STABLE or IMMUTABLE even if it seems that it does not depend on or interact with the underlying database and that its output only depends on the input value. On a closer look, it is possible to see that the output of the function depends on another function, nextval(), which returns a different value each time and is marked as VOLATILE. This therefore means that the f_files_from_directory() function cannot be anything but VOLATILE, because it depends on another VOLATILE function (which, in turn, depends on the underlying database status). Consider what would happen if the function were marked as non-volatile: its results can't remain the same since the function depends on an external resource (in this case the filesystem), which can change by itself.
PostgreSQL allows functions to be used as index expressions, but only IMMUTABLE functions can be used for this purpose. It is clear that the index must be built depending on the return values of the function, and these must be the same for the same input values or the index will not be able to cover the underlying table values.