Wednesday, January 14, 2015

Performance Tuning PostgreSQL

It's a broad topic, so here's lots of stuff for you to read up on:
  • EXPLAIN and EXPLAIN ANALYZE is extremely useful for understanding what's going on in your db-engine
  • Make sure relevant columns are indexed
  • Make sure irrelevant columns are not indexed (insert/update-performance can go down the drain if too many indexes must be updated)
  • Make sure your postgres.conf is tuned properly
  • Know what work_mem is, and how it affects your queries (mostly useful for larger queries)
  • Make sure your database is properly normalized
  • VACUUM for clearing out old data
  • ANALYZE for updating statistics (statistics target for amount of statistics)
  • Persistent connections (you could use a connection manager like pgpool or pgbouncer)
  • Understand how queries are constructed (joins, sub-selects, cursors)
  • Caching of data (i.e. memcached) is an option

And when you've exhausted those options: add more memory, faster disk-subsystem etc. Hardware matters, especially on larger datasets.
And of course, read all the other threads on postgres/databases.

Tuesday, January 13, 2015

PostgreSQL cursors

Question:

I am trying to learn basic cursors using PostgreSQL. This is my script:
DECLARE cur_employees CURSOR FOR
  SELECT *
  FROM   employee

CLOSE cur_employees 
I want to traverse the list and output those which are active. Where should I start?
 
 
Answer:
 

You very rarely want to explicitly use cursors at all in PostgreSQL, even when processing query results in plpgsql. This is a marked contrast from many other SQL databases where they are used almost all the time.
In plpgsql you could simply write something like:
DECLARE
  employee%rowtype emp;
BEGIN
  FOR emp IN SELECT * FROM employee LOOP
    IF emp.active THEN
      RAISE INFO 'Active: %', emp.employee_id
    END IF;
  END LOOP;
END
In the above, the plpgsql language handler deals with opening, binding, fetching and closing itself.
With PostgreSQL from 9.0, you can simply drop into executing plpgsql using a "DO" block. For prior versions, you need to create a function and select it. If you're looking for the PostgreSQL equivalent of, for example, iterating through a result with a cursor on SQL Server, that's what it is. Note that iteration etc. are not part of the SQL dialect, only part of plpgsql (or any of the other emebedded languages).
The "DECLARE CURSOR xxx" syntax at the SQL level can be used like this:
DECLARE cur_employees CURSOR FOR SELECT * FROM employee;
FETCH NEXT FROM cur_employees;
// etc..
CLOSE cur_employees;
This can be used to carefully get just part of the query result set. However, it's unusual to use these, since usually your client driver will provide some sort of functionality to do this (e.g. scrollable result sets in JDBC). You can also return cursors from functions similar to Oracle, although again that's a comparatively rare use case.