SQL/PSM Manual

Introduction to SQL/PSM (PL/pgPSM)

Already in the early nineties of the 20th century it was obvious that the ANSI SQL standard lacks means for creating stored procedures (especially working with variables and flow control - loops and conditionals). Commercial databases started implementing their own proprietary languages for this purpose. The most popular ones included PL/SQL (Oracle 1992), T-SQL (Sybase and Microsoft, 1995) and SPL (Informix, 1996). From 1990 a group of developers around Jim Melton, part of the the ANSI SQL commission for standards, started to work on this problem. In 1998 a draft of the new standard became part of SQL3 - under the name SQL/PSM (ANSI/ISO/IEC 9075-4:1999). Unfortunately at that time most big companies had their own languages (incompatible with the standard) and refused to abandon them in favor of standard. SQL/PSM was implemented only in those RDBMS, in which there was no support for stored procedures before 1998. Except for DB2 (SQL PL, IBM, 2001) all of them were minor RDBMS: Miner, Solid, 602SQL Server. After 2005 the SQL/PSM standard started to become more popular, when it was implemented in Advantage Database Server (Sybase iAnywhere, 2005), MySQL (2005) and PostgreSQL (2007). Implementation of SQL/PSM is usually incomplete, SQL PL in DB2 is considered to be one of the best implementations. PostgreSQL implementation is called PL/pgPSM (using the standard naming scheme in PostgreSQL).

CREATE OR REPLACE FUNCTION hello(uid integer)
RETURNS varchar AS
$$
  BEGIN
    DECLARE real_name varchar;
    -- Get real name
    SET real_name = (SELECT name || ' ' || surname 
                        FROM Users 
                       WHERE Users.uid = hello.uid);
    RETURN 'Hello, ' || real_name;
  END;
$$ LANGUAGE plpgpsm;

SELECT hello(123);

Tips for developing in PL/pgPSM language

The same tips as for all software development are recommended for code writing in PL/pgPSM. Take a big care about a readable and clear code:

use the recommended coding standard,
use comments and clear variable names,
use a consistent block indent,
do not write a long procedure (max 50 lines).

There are also some extra recommendations for general SQL store procedure languages:

use a prefix and qualified attribute name to avoid attribute and variable name collision,
use consistent exception handling
avoid ISAM programming - use SQL if possible
take care about resources - use limited amount cursors and temporary tables,
do not adjust data in triggers,
the procedures should do concrete tasks, not only encapsulate the selects.

The following two examples show the same task, but the first implementation is considered wrong. Both stored procedures select group of employees and call print_info for each group member.

-- wrong (ISAM usage)
CREATE OR REPLACE FUNCTION report_a()
RETURNS void AS 
$$
  main: FOR outer AS 
            SELECT id FROM Users 
        DO
          BEGIN
            DECLARE name, surname varchar;
            DECLARE age int;
            SET (name, surname, age) = (SELECT e.name, e.surname, e.age
                                           FROM Employers e
                                          WHERE e.id = outer.id);
            IF age >= 20 AND age <= 29 THEN
              CALL print_info(name, surname, age, 'y');
            ELSE IF age >=30 AND age <= 50 THEN
              CALL print_info(name, surname, age, 'o');
            END IF;                     
          END;
        END FOR main;
$$ LANGUAGE plpgpsm;

-- correct
CREATE OR REPLACE FUNCTION report_b()
RETURNS void AS
$$
  main: FOR fc AS
            -- SELECT statement is used for get limited employee selection based on conditions
            SELECT e.*, CASE WHEN e.age BETWEEN 20 AND 29 THEN 'y'
                             WHEN e.age BETWEEN 30 AND 50 THEN 'o' END AS tp
               FROM Employers e
              WHERE e.age BETWEEN 20 AND 50
        DO
          CALL print_info(fc.name, fc.surname, fc.age, fc.tp);
        END FOR main;
$$ LANGUAGE plpgpsm;

PL/pgPSM handbook

Below you can see some examples of valid functions. Each function contains one PL/pgPSM statement.

Note: This could look a little strange, compared with other PL/SQL languages, where the main statement is usually a compound statement. If you do not use any SQL statement inside the PL/pgPSM function, use the function attribute IMMUTABLE. It can, besides other benefits, improve function performance. A compound statement is a special PL/pgPSM statement, which allows you to write unlimited sequences of PL/pgPSM statements.

CREATE OR REPLACE FUNCTION sum2params(IN a integer, IN b integer, OUT c integer) AS 
$$
  SET c = a + b;
$$ LANGUAGE plpgpsm IMMUTABLE;

CREATE OR REPLACE FUNCTION insert_val(a integer)
RETURNS void AS
$$
  INSERT INTO Foo VALUES(a);
$$ LANGUAGE plpgpsm;

CREATE OR REPLACE FUNCTION get_sum(IN a integer, OUT b integer) AS
$$
  SET b = (SELECT sum(f.a) 
              FROM Foo f
             WHERE f.a > get_sum.a);
$$ LANGUAGE plpgpsm;

CREATE OR REPLACE FUNCTION dummy() 
RETURNS void AS
$$
  BEGIN
  END;
$$ LANGUAGE plpgpsm IMMUTABLE;

The PL/PgPSM contains the following statements:

Before starting to read about each statement, go through the following examples:

-- case statement
CREATE OR REPLACE FUNCTION foo1(a integer)
RETURNS void AS $$
  CASE a
    WHEN 1, 3, 5, 7, 9 THEN
      PRINT a, 'is odd number';
    WHEN 2, 4, 6, 8, 10 THEN
      PRINT a. 'is odd number';
    ELSE 
      PRINT a, 'isn't from range 1..10';
  END CASE;
$$ LANGUAGE plpgpsm;

-- while statement
CREATE OR REPLACE FUNCTION foo2(a integer)
RETURNS void AS 
$$
  BEGIN
    DECLARE i integer DEFAULT 1;
    WHILE i <= a 
    DO
      PRINT i;
      SET i = i + 1;
    END WHILE;
  END
$$ LANGUAGE plpgpsm;

-- for statement
CREATE OR REPLACE FUNCTION foo3(a integer)
RETURNS void AS 
$$
  FOR fc AS
      SELECT i 
         FROM generate_series(1,a) AS g(i)
  DO
    PRINT fc.i;
  END FOR;
$$ LANGUAGE plpgpsm;

Function attributes

PL/pgPSM has two function switches: DUMP and RECOMPILE. The first switch leads to the dumping of compiled code into the system log. The second switch invalidates all cached plans related to the function every time when the function is called. These switches must be written before the PL/pgPSM function body together with the keyword# OPTION.

#OPTION DUMP

The aim of this option is to dump the compiled code into the system logfile. It can be very helpful to find collisions of variable names and SQL attributes. In such a case the system warns us, that only useless information appears in compilation time or returns unexpected values.

This switch well demonstrates the fact that PL/pgPSM is mostly a preprocessor of SQL language. The interpreter itself is so minimalistic, that all logic and arithmetic operations are performed by SQL. This feature ensures the compatibility of the types and functions of data with SQL. On the other hand, PL/pgPSM is not suitable for demanding computing tasks (the SQL statement execution cost is considerable). For these kinds of tasks PL/Perl, PL/Python or C procedures are strongly recommended in PostgreSQL.

-- table foo contains column a
CREATE OR REPLACE FUNCTION collision(a integer)
RETURNS SETOF Foo AS
$$
#option dump
  SELECT * 
    FROM Foo 
   WHERE a = a;
$$ LANGUAGE plpgpsm;

This is very tricky bug (and very common). Compiled code follows:

Execution tree of successfully compiled PL/pgSQL function collision(integer):                            
                                                                                                      
Function's data area:                                                                                 
    entry 0: VAR $1               type int4 (typoid 23) atttypmod -1                                  
    entry 1: VAR found            type bool (typoid 16) atttypmod -1                                  
                                                                                                      
Function's statements:                                                                                
  0: *unnamed*:                                                                                       
     BLOCK                                                                                            
  2:   SQL 'SELECT * FROM Foo WHERE  $1  =  $1  {$1=0}'                                               
  0:   RETURN NULL                                                                                    
     END *unnamed*                                                                                    
                                                                                                      
End of execution tree of function collision(integer)

Error is in the WHERE condition. PL/pgPSM understands this condition as comparison of variable "a" with itself instead of comparison between variable "a" and the attribute "a". Because of this, the condition is true for all rows, and the result is the whole Foo table.

#OPTION RECOMPILE

This option invokes procedure recompilation before each execution. It decreases execution plan non-consistency, but on the other hand it increases time execution of procedure start (compilation is performed) and time of execution (new plan is generated each time). It is the reason why this option should be used very carefully. How to avoid usage of this option is described in section dedicated to temporary tables (Usage of temporary tables in PL/pgPSM). RECOMPILE option is designed for easier porting of stored procedures from other RDBMS, where storage of execution plan has different behavior. This option is useless in PostgreSQL 8.3. and isn't supported there.

CREATE OR REPLACE FUNCTION testr()
RETURNS SETOF Foo AS
$$
#option dump recompile
  BEGIN
    DROP TABLE IF EXISTS FooG;
    INSERT INTO FooG
       SELECT * 
          FROM Foo;
    RETURN TABLE(SELECT *
                    FROM FooG);
  END;
$$ LANGUAGE plpgpsm;

This option is analogous to the option WITH RECOMPILE in T-SQL in Microsoft SQL Server.

Porting stored procedures from MySQL 5.x

Original source code of a procedure that generates fractals was taken from the MySQL dev. archive. PostgreSQL solves following things different way though:

change of the actual scheme,
the way of registering of the procedure source code (the separator mechanism is not used),
for joining of strings is used operator || (not the function concat),
it doesn't support use of open SQL statements, because it doesn't support procedures, but only functions.

Notice the minimal differences in the code modified for PostgreSQL.

PL/pgPSM installation

Because the implementation of the PL/pgPSM is relatively new, this interpreter isn't included in the distribution. So far, it's available only on the PostgreSQL addons development pages http://pgfoundry.org. Integration into the distribution would be possible after the full implementation of standard and thorough testing. The PL/pgPSM is however usable right now if you can live with above. PL/pgPSM interpreter is the modification of PL/pgSQL interpreter, which has been thoroughly checked by the thousands of projects within the years.

For the compilation of PL/pgPSM, PostgreSQL has to be installed from the source codes At least version 8.2 is required. The latest version of PL/pgPSM can be found on http://pgfoundry.org/frs/?group_id=1000238&release_id=767.

Installation instructions

Download the latest source codes of PL/pgPSM from the http://pgfoundry.org/frs/?group_id=1000238 directory.
If you have PostgreSQL installed from the source codes, unpack the archive into contrib directory.
Execute make and make install commands.
Run psql postgres < plpgpsm.sql (must be logged in as root user).
From this moment plpgpsm can be used the same way as other languages.
Installation test: make installcheck.

If you don't have PostgreSQL installed from the source codes, you need to have at least PostgreSQL development libraries.

Unpack the plpgpsm archive somewhere.
Execute make USE PGXS=1 and make USE_PGXS=1 install commands
The rest is same as in previous list.

We would appreciate any form of cooperation. That is either the completion of this documentation, it's correction, translation, extension of the tests, completion of it's functionality or simply it's usage.