La PGConf US 2018 aura lieu à Jersey City (New Jersey, États-Unis) du 16 au 20 avril 2018 : https://pgconf.org/conferences/2018

Les nouveautés des produits dérivés

Offres d'emplois autour de PostgreSQL en mars

PostgreSQL Local

PostgreSQL dans les média

PostgreSQL Weekly News / les nouvelles hebdomadaires vous sont offertes cette semaine par David Fetter. Traduction par l'équipe PostgreSQLFr sous licence CC BY-NC-SA. La version originale se trouve à l'adresse suivante : http://www.postgresql.org/message-id/20180318221506.GA24540@fetter.org

Proposez vos articles ou annonces avant dimanche 15:00 (heure du Pacifique). Merci de les envoyer en anglais à david (a) fetter.org, en allemand à pwn (a) pgug.de, en italien à pwn (a) itpug.org et en espagnol à pwn (a) arpug.com.ar.

Correctifs appliqués

Álvaro Herrera pushed:

Peter Eisentraut pushed:

Michael Meskes pushed:

Tom Lane pushed:

  • Avoid holding AutovacuumScheduleLock while rechecking table statistics. In databases with many tables, re-fetching the statistics takes some time, so that this behavior seriously decreases the available concurrency for multiple autovac workers. There's discussion afoot about more complete fixes, but a simple and back-patchable amelioration is to claim the table and release the lock before rechecking stats. If we find out there's no longer a reason to process the table, re-taking the lock to un-claim the table is cheap enough. (This patch is quite old, but got lost amongst a discussion of more aggressive fixes. It's not clear when or if such a fix will be accepted, but in any case it'd be unlikely to get back-patched. Let's do this now so we have some improvement for the back branches.) In passing, make the normal un-claim step take AutovacuumScheduleLock not AutovacuumLock, since that is what is documented to protect the wi_tableoid field. This wasn't an actual bug in view of the fact that readers of that field hold both locks, but it creates some concurrency penalty against operations that need only AutovacuumLock. Back-patch to all supported versions. Jeff Janes Discussion: https://postgr.es/m/26118.1520865816@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/38f7831d703be98aaece8af6625faeab5123a02c
  • When updating reltuples after ANALYZE, just extrapolate from our sample. The existing logic for updating pg_class.reltuples trusted the sampling results only for the pages ANALYZE actually visited, preferring to believe the previous tuple density estimate for all the unvisited pages. While there's some rationale for doing that for VACUUM (first that VACUUM is likely to visit a very nonrandom subset of pages, and second that we know for sure that the unvisited pages did not change), there's no such rationale for ANALYZE: by assumption, it's looked at an unbiased random sample of the table's pages. Furthermore, in a very large table ANALYZE will have examined only a tiny fraction of the table's pages, meaning it cannot slew the overall density estimate very far at all. In a table that is physically growing, this causes reltuples to increase nearly proportionally to the change in relpages, regardless of what is actually happening in the table. This has been observed to cause reltuples to become so much larger than reality that it effectively shuts off autovacuum, whose threshold for doing anything is a fraction of reltuples. (Getting to the point where that would happen seems to require some additional, not well understood, conditions. But it's undeniable that if reltuples is seriously off in a large table, ANALYZE alone will not fix it in any reasonable number of iterations, especially not if the table is continuing to grow.) Hence, restrict the use of vac_estimate_reltuples() to VACUUM alone, and in ANALYZE, just extrapolate from the sample pages on the assumption that they provide an accurate model of the whole table. If, by very bad luck, they don't, at least another ANALYZE will fix it; in the old logic a single bad estimate could cause problems indefinitely. In HEAD, let's remove vac_estimate_reltuples' is_analyze argument altogether; it was never used for anything and now it's totally pointless. But keep it in the back branches, in case any third-party code is calling this function. Per bug #15005. Back-patch to all supported branches. David Gould, reviewed by Alexander Kuzmenkov, cosmetic changes by me Discussion: https://postgr.es/m/20180117164916.3fdcf2e9@engels https://git.postgresql.org/pg/commitdiff/d04900de7d0cb5b6ecb6d5bf9fdb6f3105824f81
  • Clean up duplicate role and schema names in regression tests. Since these names are global, using the same ones in different regression tests creates a hazard of test failures if any two such scripts run concurrently. Let's establish a policy of not doing that. In the cases where a conflict existed, I chose to rename both sides: in principle one script or the other could've been left in possession of the common name, but that seems to just invite more trouble of the same sort. There are a number of places where scripts are using names that seem unduly generic, but in the absence of actual conflicts I left them alone. In addition, fix insert.sql's use of "someone_else" as a role name. That's a flat out violation of longstanding project policy, so back-patch that change to v10 where the usage appeared. The rest of this is just future-proofing, as no two of these scripts are actually run concurrently in the existing parallel_schedule. Conflicts of schema-qualified names also exist, but will be dealt with separately. Discussion: https://postgr.es/m/4627.1521070268@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/fb7db40ad26c3384f81d471442743076ade3f82a
  • Clean up duplicate table and function names in regression tests. Many of the objects we create during the regression tests are put in the public schema, so that using the same names in different regression tests creates a hazard of test failures if any two such scripts run concurrently. This patch cleans up a bunch of latent hazards of that sort, as well as two live hazards. The current situation in this regard is far worse than it was a year or two back, because practically all of the partitioning-related test cases have reused table names with enthusiasm. I despaired of cleaning up that mess within the five most-affected tests (create_table, alter_table, insert, update, inherit); fortunately those don't run concurrently. Other than partitioning problems, most of the issues boil down to using names like "foo", "bar", "tmp", etc, without thought for the fact that other test scripts might use similar names concurrently. I've made an effort to make all such names more specific. One of the live hazards was that commit 7421f4b8 caused with.sql to create a table named "test", conflicting with a similarly-named table in alter_table.sql; this was exposed in the buildfarm recently. The other one was that join.sql and transactions.sql both create tables named "foo" and "bar"; but join.sql's uses of those names date back only to December or so. Since commit 7421f4b8 was back-patched to v10, back-patch a minimal fix for that problem. The rest of this is just future-proofing. Discussion: https://postgr.es/m/4627.1521070268@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/2cf8c7aa48559699f0607f5cb77b661156ad9750
  • Fix msvc/ecpg_regression.proj for recent ECPG test additions. Commit 3b7ab4380 added some tests that require ecpg to be given the new "-C ORACLE" switch. Teach the MSVC build infrastructure about that. Michael Paquier Discussion: https://postgr.es/m/8299.1521154647@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/84a3611cccfa41f29d14a5b1ab4bcc58fba9b25b
  • Mop-up for letting VOID-returning SQL functions end with a SELECT. Part of the intent in commit fd1a421fe was to allow SQL functions that are declared to return VOID to contain anything, including an unrelated final SELECT, the same as SQL-language procedures can. However, the planner's inlining logic didn't get that memo. Fix it, and add some regression tests covering this area, since evidently we had none. In passing, clean up some typos in comments in create_function_3.sql, and get rid of its none-too-safe assumption that DROP CASCADE notice output is immutably ordered. Per report from Prabhat Sahu. Discussion: https://postgr.es/m/CANEvxPqxAj6nNHVcaXxpTeEFPmh24Whu+23emgjiuKrhJSct0A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/877cdf11eaa9cabcb9b1e3c1bef0760fe08efdc3
  • Doc: explicitly point out that enum values can't be dropped. This was not stated in so many words anywhere. Document it to make clear that it's a design limitation and not just an oversight or documentation omission. Discussion: https://postgr.es/m/152089733343.1222.6927268289645380498@wrigleys.postgresql.org https://git.postgresql.org/pg/commitdiff/013c0baaddd9df7e1f940184a8ab8e478214b46c
  • Fix query-lifespan memory leakage in repeatedly executed hash joins. ExecHashTableCreate allocated some memory that wasn't freed by ExecHashTableDestroy, specifically the per-hash-key function information. That's not a huge amount of data, but if one runs a query that repeats a hash join enough times, it builds up. Fix by arranging for the data in question to be kept in the hashtable's hashCxt instead of leaving it "loose" in the query-lifespan executor context. (This ensures that we'll also clean up anything that the hash functions allocate in fn_mcxt.) Per report from Amit Khandekar. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAJ3gD9cFofAWGvcxLOxDHC=B0hjtW8yGmUsF2hdGh97CM38=7g@mail.gmail.com https://git.postgresql.org/pg/commitdiff/9e17bdb8a525ff89c4535cd153dc0f2fa813ea59
  • Fix WHERE CURRENT OF when the referenced cursor uses an index-only scan. "UPDATE/DELETE WHERE CURRENT OF cursor_name" failed, with an error message like "cannot extract system attribute from virtual tuple", if the cursor was using a index-only scan for the target table. Fix it by digging the current TID out of the indexscan state. It seems likely that the same failure could occur for CustomScan plans and perhaps some FDW plan types, so that leaving this to be treated as an internal error with an obscure message isn't as good an idea as it first seemed. Hence, add a bit of heaptuple.c infrastructure to let us deliver a more on-topic message. I chose to make the message match what you get for the case where execCurrentOf can't identify the target scan node at all, "cursor "foo" is not a simply updatable scan of table "bar"". Perhaps it should be different, but we can always adjust that later. In the future, it might be nice to provide hooks that would let custom scan providers and/or FDWs deal with this in other ways; but that's not a suitable topic for a back-patchable bug fix. It's been like this all along, so back-patch to all supported branches. Yugo Nagata and Tom Lane Discussion: https://postgr.es/m/20180201013349.937dfc5f.nagata@sraoss.co.jp https://git.postgresql.org/pg/commitdiff/8f5ac440430ab1d4b0309a76df278caa87018beb
  • Fix overflow handling in plpgsql's integer FOR loops. The test to exit the loop if the integer control value would overflow an int32 turns out not to work on some ICC versions, as it's dependent on the assumption that the compiler will execute the code as written rather than "optimize" it. ICC lacks any equivalent of gcc's -fwrapv switch, so it was optimizing on the assumption of no integer overflow, and that breaks this. Rewrite into a form that in fact does not do any overflowing computations. Per Tomas Vondra and buildfarm member fulmar. It's been like this for a long time, although it was not till we added a regression test case covering the behavior (in commit dd2243f2a) that the problem became apparent. Back-patch to all supported versions. Discussion: https://postgr.es/m/50562fdc-0876-9843-c883-15b8566c7511@2ndquadrant.com https://git.postgresql.org/pg/commitdiff/2dbee9f19f6c6ac6c013f668611492486e623807
  • Doc: note that statement-level view triggers require an INSTEAD OF trigger. If a view lacks an INSTEAD OF trigger, DML on it can only work by rewriting the command into a command on the underlying base table(s). Then we will fire triggers attached to those table(s), not those for the view. This seems appropriate from a consistency standpoint, but nowhere was the behavior explicitly documented, so let's do that. There was some discussion of throwing an error or warning if a statement trigger is created on a view without creating a row INSTEAD OF trigger. But a simple implementation of that would result in dump/restore ordering hazards. Given that it's been like this all along, and we hadn't heard a complaint till now, a documentation improvement seems sufficient. Per bug #15106 from Pu Qun. Back-patch to all supported branches. Discussion: https://postgr.es/m/152083391168.1215.16892140713507052796@wrigleys.postgresql.org https://git.postgresql.org/pg/commitdiff/a4678320471380e5159a8d6e89466d74d6ee1739

Robert Haas pushed:

  • Let Parallel Append over simple UNION ALL have partial subpaths. A simple UNION ALL gets flattened into an appendrel of subquery RTEs, but up until now it's been impossible for the appendrel to use the partial paths for the subqueries, so we can implement the appendrel as a Parallel Append but only one with non-partial paths as children. There are three separate obstacles to removing that limitation. First, when planning a subquery, propagate any partial paths to the final_rel so that they are potentially visible to outer query levels (but not if they have initPlans attached, because that wouldn't be safe). Second, after planning a subquery, propagate any partial paths for the final_rel to the subquery RTE in the outer query level in the same way we do for non-partial paths. Third, teach finalize_plan() to account for the possibility that the fake parameter we use for rescan signalling when the plan contains a Gather (Merge) node may be propagated from an outer query level. Patch by me, reviewed and tested by Amit Khandekar, Rajkumar Raghuwanshi, and Ashutosh Bapat. Test cases based on examples by Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+Tgmoa6L9A1nNCk3aTDVZLZ4KkHDn1+tm7mFyFvP+uQPS7bAg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/0927d2f46ddd4cf7d6bf2cc84b3be923e0aedc52
  • Pass additional arguments to a couple of grouping-related functions. get_number_of_groups() and make_partial_grouping_target() currently fish information directly out of the PlannerInfo; in the former case, the target list, and in the latter case, the HAVING qual. This works fine if there's only one grouping relation, but if the pending patch for partition-wise aggregate gets committed, we'll have multiple grouping relations and must therefore use appropriately translated versions of these values for each one. To make that simpler, pass the values to be used as arguments. Jeevan Chalke. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, Rafia Sabih, and me. Discussion: http://postgr.es/m/CAM2+6=UqFnFUypOvLdm5TgC+2M=-E0Q7_LOh0VDFFzmk2BBPzQ@mail.gmail.com Discussion: http://postgr.es/m/CAM2+6=W+L=C4yBqMrgrfTfNtbtmr4T53-hZhwbA2kvbZ9VMrrw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/648a6c7bd815f98b35709bd56f9f1ca276b33ae6
  • Split create_grouping_paths into degenerate and non-degenerate cases. There's no functional change here, or at least I hope there isn't, just code rearrangement. The rearrangement is motivated by partition-wise aggregate, which doesn't need to consider the degenerate case but wants to reuse the logic for the ordinary case. Based loosely on a patch from Ashutosh Bapat and Jeevan Chalke, but I whacked it around pretty heavily. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, Rafia Sabih, and me. Discussion: http://postgr.es/m/CAFjFpRewpqCmVkwvq6qrRjmbMDpN0CZvRRzjd8UvncczA3Oz1Q@mail.gmail.com https://git.postgresql.org/pg/commitdiff/1466bcfa4a83ba88b55cfcebb99080800627613a

Andres Freund pushed:

Stephen Frost pushed:

Magnus Hagander pushed:

  • Fix pg_recvlogical for pre-10 versions. In e170b8c8, protection against modified search_path was added. However, PostgreSQL versions prior to 10 does not accept SQL commands over a replication connection, so the protection would generate a syntax error. Since we cannot run SQL commands on it, we are also not vulnerable to the issue that e170b8c8 fixes, so we can just skip this command for older versions. Author: Michael Paquier <michael@paquier.xyz> https://git.postgresql.org/pg/commitdiff/8d2814f274def85f39fbe997d454b01628cb5667

Correctifs en attente

Ashutosh Bapat sent in another revision of a patch to implement an advanced partition matching algorithm for partition-wise join.

Amit Khandekar sent in another revision of a patch to fix a concurrency bug in UPDATE of partition-key.

Ildus Kurbangaliev sent in another revision of a patch to support a prefix operator for text with SP-GiST indexes.

Nikita Glukhov sent in another revision of a patch to add a JSONB TRANSFORM for PL/PythonU.

Kyotaro HORIGUCHI sent in two more revisions of a patch to protect syscache from bloating with negative cache entries.

Anthony Bykov and Nikita Glukhov traded patches to add a JSONB TRANSFORM for PL/Perl.

John Naylor sent in two more revisions of a patch to rationalize how bootstrap data is handled.

Ashutosh Bapat, Jeevan Chalke, and Robert Haas traded patches to implement partition-wise aggregation/grouping.

Amul Sul sent in another revision of a patch to restrict concurrent update/delete with UPDATE of partition key.

Takayuki Tsunakawa sent in another revision of a patch to reset the temp schema in order to unblock autovacuum, the blocking of which could lead to XID wraparound.

Anastasia Lubennikova sent in another revision of a patch to use base backup exclusion filters to reduce data transferred with pg_rewind.

Oleg Ivanov sent in another revision of a patch to implement generic WAL compression.

David Steele sent in another revision of a patch to exclude temporary relations from base backup.

Peter Eisentraut sent in a patch to deal with some of the fallout from changing to stdbool.h.

David Steele sent in another revision of a patch to add regression tests for reinit.c.

Andrew Dunstan sent in two more revisions of a patch to make ALTER TABLE ... ADD COLUMN ... DEFAULT [not null] work faster.

Amit Kapila sent in another revision of a patch to implement predicate locking for hash indexes.

Ashutosh Bapat sent in another revision of a patch to fix an infelicity between parallel append and a simple UNION ALL.

Amit Langote sent in two more revisions of a patch to lay down infrastructure for faster partition pruning.

Robert Haas sent in another revision of a patch to enable parallel seq scan for slow functions.

Nathan Bossart sent in another revision of a patch to enable pg_upgrade to clusters with a different WAL segment size.

Ildus Kurbangaliev sent in another revision of a patch to implement custom compression methods.

Thomas Munro sent in a patch to fix a performance regression in REFRESH MATERIALIZED VIEW CONCURRENTLY.

Pavan Deolasee and Fabien COELHO traded patches to make inserts with mostly-monotonically increasing values faster.

Amit Langote sent in another revision of a patch to fix a bug where inserts into partitioned table could cause a crash.

David Steele sent in another revision of a patch to allow a configurable file mode mask.

Michaël Paquier sent in another revision of a patch to fix an issue that manifested as cascaded standby unable to start after a clean shutdown.

Nikita Glukhov sent in another revision of a patch to implement JSON_TABLE from SQL/JSON.

Nikita Glukhov sent in another revision of a patch to implement json_path from SQL/JSON.

Konstantin Knizhnik sent in a patch to make the implementation of WaitLatchOrSocket more efficient.

Kyotaro HORIGUCHI and Michaël Paquier traded patches to fix a bug where pg_get_functiondef forgets about most GUC_LIST_INPUT GUCs.

Nikita Glukhov sent in another revision of a patch to add functions from SQL/JSON.

Daniel Gustafsson and Michael Banck traded patches to make it possible to enable checksums online.

Peter Eisentraut sent in another revision of a patch to improve handling of heap rewrites in logical decoding.

Kyotaro HORIGUCHI sent in a patch to fix an issue where PQHost() has undefined behavior when the connect string contains both host and hostaddr types.

Michaël Paquier sent in a patch to fix a problem while setting the fpw with SIGHUP.

Pavel Stěhule sent in another revision of a patch to add extra checks to PL/pgsql.

Pavel Stěhule sent in a patch to give PROCEDUREs in PL/pgsql the same parameter naming facilities functions have.

Shubham Barai sent in another revision of a patch to enable predicate locking on GIN indexes.

Peter Eisentraut sent in a patch to enable nested CALL with transactions.

Daniel Gustafsson sent in a patch to make links to external pages in the documents HTTPS where they had been HTTP.

Claudio Freire sent in another revision of a patch to enable VACUUM to use >1GB of work_mem.

Fabien COELHO sent in another revision of a patch to manipulate random seeds in pgbench.

Daniel Gustafsson and Chapman Flack traded patches to zero the headers of unused pages after WAL switch, and add a test to ensure that WAL segment is zeroed out.

Hongyuan Ma sent in a patch against the performance farm code to add apps directory, model classes, and a db_tools directory.

Michael Banck sent in another revision of a patch to verify checksums during basebackups.

Huong Dangminh sent in a patch to fix an issue where it was possible to get a segmentation fault when using GROUPING SETS with all unsortable columns.

Álvaro Herrera sent in another revision of a patch to implement ON CONFLICT DO UPDATE for partitioned tables.

Joe Wildish sent in another revision of a patch to implement SQL ASSERTIONs.

Alexander Kuzmenkov sent in a patch to fix an issue where IndexJoin had a memory problem using SP-GiST and boxes.

Isaac Morland sent in a PoC patch to make more flexible permissions for REFRESH MATERIALIZED VIEW.