PostgreSQL Weekly News

PostgreSQL Product News

pg_chameleon 2.0.5, a tool for replicating from MySQL to PostgreSQL: https://pypi.python.org/pypi/pg_chameleon
pglogical 2.2, a logical-WAL-based replication system for PostgreSQL: https://www.2ndquadrant.com/en/resources/pglogical/

PostgreSQL Jobs for March

International: http://archives.postgresql.org/pgsql-jobs/2018-03/
French-speaking: http://forums.postgresql.fr/viewforum.php?id=4

PostgreSQL Local

PGConf APAC 2018 will be held in Singapore, March 22-24, 2018: http://2018.pgconfapac.org/
The German-speaking PostgreSQL Conference 2018 will take place on April 13, 2018 in Berlin: http://2018.pgconf.de/
PGConfNepal 2018 will be held May 4-5, 2018 at Kathmandu University, Dhulikhel, Nepal: https://postgresconf.org/conferences/Nepal2018
PGCon 2018 will take place in Ottawa, May 29 - June 1, 2018: https://www.pgcon.org/2018/
Swiss PGDay 2018 will take place in Rapperswil-Jona (near Zurich) on June 29, 2018. The call for speakers runs from February 6 to April 14, 2018, and registration is open from February 6 to June 28, 2018: http://www.pgday.ch/2018/
PGConf.Brazil 2018 will take place in São Paulo, Brazil, on August 3-4, 2018: http://pgconf.com.br

PostgreSQL in the News

Planet PostgreSQL : http://planet.postgresql.org/
Planet PostgreSQLFr : http://planete.postgresql.fr/

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team under the CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20180325213457.GA20038@fetter.org

Submit news and announcements by Sunday at 3:00pm Pacific time. Please send English language ones to david (a) fetter.org, German language to pwn (a) pgug.de, Italian language to pwn (a) itpug.org, and Spanish language to pwn (a) arpug.com.ar.

Applied Patches

Magnus Hagander pushed:

Fix typo in comment. Author: Daniel Gustafsson <daniel@yesql.se> https://git.postgresql.org/pg/commitdiff/71cce90ee99098f52e65278b96662e32ca005771

Robert Haas pushed:

Rewrite recurse_union_children to iterate, rather than recurse. Also, rename it to plan_union_children, since the old name wasn't very descriptive. This results in a small net reduction in code, seems at least to me to be easier to understand, and saves space on the process stack. Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/49525c46309828b3024fe8040fa99c7dcc83933d
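The recursion-to-iteration idea in the commit above can be pictured with a small Python sketch. The tuple-based tree and the function name flatten_union_children are hypothetical stand-ins, not PostgreSQL's planner structures: nested union nodes are walked with an explicit work list instead of the call stack, which is why deep nesting no longer costs stack space.

```python
def flatten_union_children(node):
    """Collect the leaf inputs of nested ("union", left, right) nodes iteratively."""
    pending = [node]  # explicit work list replaces the call stack
    leaves = []
    while pending:
        current = pending.pop()
        if isinstance(current, tuple) and current[0] == "union":
            # Push the right child first so the left child is handled first.
            pending.append(current[2])
            pending.append(current[1])
        else:
            leaves.append(current)
    return leaves


# Hypothetical nested UNION of four inputs.
tree = ("union", ("union", "a", "b"), ("union", "c", "d"))
print(flatten_union_children(tree))
```

Because the work list lives on the heap, the depth of nesting is bounded by available memory rather than by the stack limit.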
Generate a separate upper relation for each stage of setop planning. Commit 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f made setop planning stages return paths rather than plans, but all such paths were loosely associated with a single RelOptInfo, and only the final path was added to the RelOptInfo. Even at the time, it was foreseen that this should be changed, because there is otherwise no good way for a single stage of setop planning to return multiple paths. With this patch, each stage of set operation planning now creates a separate RelOptInfo; these are distinguished by using appropriate relid sets. Note that this patch does nothing whatsoever about actually returning multiple paths for the same set operation; it just makes it possible for a future patch to do so. Along the way, adjust things so that create_upper_paths_hook is called for each of these new RelOptInfos rather than just once, since that might be useful to extensions using that hook. It might be good to provide an FDW API here as well, but I didn't try to do that for now. Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/c596fadbfe20ff50a8e5f4bc4b4ff5b7c302ecc0
Defer creation of partially-grouped relation until it's needed. This avoids unnecessarily creating a RelOptInfo for which we have no actual need. This idea is from Ashutosh Bapat, who wrote a very different patch to accomplish a similar goal. It will be more important if and when we get partition-wise aggregate, since then there could be many partially grouped relations all of which could potentially be unnecessary. In passing, this sets the grouping relation's reltarget, which wasn't done previously but makes things simpler for this refactoring. Along the way, adjust things so that add_paths_to_partial_grouping_rel, now renamed create_partial_grouping_paths, does not perform the Gather or Gather Merge steps to generate non-partial paths from partial paths; have the caller do it instead. This is again for the convenience of partition-wise aggregate, which wants to inject additional partial paths after the initial partial paths are created and before we decide which ones to Gather/Gather Merge. This might seem like a separate change, but it's actually pretty closely entangled; I couldn't really see much value in separating it and having to change some things twice. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/4f15e5d09de276fb77326be5567dd9796008ca2e
Determine grouping strategies in create_grouping_paths. Partition-wise aggregate will call create_ordinary_grouping_paths multiple times and we don't want to redo this work every time; have the caller do it instead and pass the details down. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoY7VYYn9a7YHj1nJL6zj6BkHmt4K-un9LRmXkyqRZyynA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/b5996c2791f36a79332e3cb7130e9125a0372730
Don't pass the grouping target around unnecessarily. Since commit 4f15e5d09de276fb77326be5567dd9796008ca2e made grouped_rel set reltarget, a variety of other functions can just get it from grouped_rel instead of having to pass it around explicitly. Simplify accordingly. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/94150513ec12c13eb7c98430fc34f477896d38c9
Call pgstat_report_activity() in parallel CREATE INDEX workers. Also set debug_query_string. Oversight in commit 9da0cc35284bdbe8d442d732963303ff0e0a40bc. Peter Geoghegan, per a report by Phil Florent. Discussion: https://postgr.es/m/CAH2-Wzmf-34hD4n40uTuE-ZY9P5c%2BmvhFbCdQfN%3DKrKiVm3j3A%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/7de4a1bcc56f494acbd0d6e70781df877dc8ecb5
doc: Update parallel join documentation for Parallel Shared Hash. Thomas Munro Discussion: http://postgr.es/m/CAEepm=3XdL=+bn3=WQVCCT5wwfAEv-4onKpk+XQZdwDXv6etzA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/f644c3b386acc9e1bfef2c4fbe738706d3ccf3a3
Fix typo in comment. Michael Paquier Discussion: http://postgr.es/m/20180205071404.GB17337@paquier.xyz https://git.postgresql.org/pg/commitdiff/8a8c4f3b325ea00cc4ffb106a71e65e79c5d7af9
Avoid creating a TOAST table for a partitioned table. It's useless. Amit Langote Discussion: http://postgr.es/m/b4c9dee6-d134-49b8-79c4-07fbd7c3b898@lab.ntt.co.jp https://git.postgresql.org/pg/commitdiff/2fe6336e2d48d77fca6d0849f03c0faa06725159
Consider Parallel Append of partial paths for UNION [ALL]. Without this patch, we can implement a UNION or UNION ALL as an Append where Gather appears beneath one or more of the Append branches, but this lets us put the Gather node on top, with a partial path for each relation underneath. There is considerably more work that could be done to improve planning in this area, but that will probably need to wait for a future release. Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/88ba0ae2aa4aaba8ea0d85c0ff81cc46912d9308
Implement partition-wise grouping/aggregation. If the partition keys of input relation are part of the GROUP BY clause, all the rows belonging to a given group come from a single partition. This allows aggregation/grouping over a partitioned relation to be broken down into aggregation/grouping on each partition. This should be no worse, and often better, than the normal approach. If the GROUP BY clause does not contain all the partition keys, we can still perform partial aggregation for each partition and then finalize aggregation after appending the partial results. This is less certain to be a win, but it's still useful. Jeevan Chalke, Ashutosh Bapat, Robert Haas. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, and Rafia Sabih. Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/e2f1eb0ee30d144628ab523432320f174a2c8966
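The two cases described in the commit above can be checked on toy data. This is only a sketch in Python: the partitions, the row layout (region, product, amount), and the helper names are all hypothetical, and a SUM aggregate stands in for aggregation in general.

```python
# Hypothetical table partitioned by region; each row is (region, product, amount).
partitions = {
    "eu": [("eu", "apple", 3), ("eu", "pear", 2), ("eu", "apple", 1)],
    "us": [("us", "apple", 5), ("us", "pear", 4)],
}
all_rows = [row for rows in partitions.values() for row in rows]


def aggregate(rows, key):
    """Compute SUM(amount) grouped by key(row)."""
    totals = {}
    for row in rows:
        group = key(row)
        totals[group] = totals.get(group, 0) + row[2]
    return totals


def by_region_product(row):
    return (row[0], row[1])


def by_product(row):
    return row[1]


# Case 1: GROUP BY contains the partition key (region). Every group lives in
# exactly one partition, so per-partition results are simply appended.
full = {}
for rows in partitions.values():
    full.update(aggregate(rows, by_region_product))

# Case 2: GROUP BY omits the partition key. Aggregate partially per partition,
# append the partial results, then finalize by combining the partial sums.
final = {}
for rows in partitions.values():
    for product, partial_sum in aggregate(rows, by_product).items():
        final[product] = final.get(product, 0) + partial_sum

print(full == aggregate(all_rows, by_region_product))
print(final == aggregate(all_rows, by_product))
```

In the first case no finalize step is needed at all, which matches the commit's remark that it should be no worse than the normal approach; the second case still has to combine partial results, so its benefit is less certain.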

Álvaro Herrera pushed:

Fix state reversal after partition tuple routing. We make some changes to ModifyTableState and the EState it uses whenever we route tuples to partitions; but we weren't restoring properly in all cases, possibly causing crashes when partitions with different tuple descriptors are targeted by tuples inserted in the same command. Refactor some code, creating ExecPrepareTupleRouting, to encapsulate the needed state changing logic, and have it invoked one level above its current place (i.e. put it in ExecModifyTable instead of ExecInsert); this makes it all more readable. Add a test case to exercise this. We don't support having views as partitions; and since only views can have INSTEAD OF triggers, there is no point in testing for INSTEAD OF when processing insertions into a partitioned table. Remove code that appears to support this (but which is actually never relevant). In passing, fix location of some very confusing comments in ModifyTableState. Reported-by: Amit Langote Author: Etsuro Fujita, Amit Langote Discussion: https://postgr.es/m/0473bf5c-57b1-f1f7-3d58-455c2230bc5f@lab.ntt.co.jp https://git.postgresql.org/pg/commitdiff/6666ee49f49c4a6b008591aea457becffa0df041
Expand comment a little bit. The previous commit removed a comment that was a bit more verbose than its replacement. https://git.postgresql.org/pg/commitdiff/839a8eb2b3df68e105fb4f7a72e71652d6becc7a
Remove unnecessary members from ModifyTableState and ExecInsert. These values can be obtained from the ModifyTable node which is already a part of both the ModifyTableState and ExecInsert. Author: Álvaro Herrera, Amit Langote Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/20180316151303.rml2p5wffn3o6qy6@alvherre.pgsql https://git.postgresql.org/pg/commitdiff/ee0a1fc84eb29c916687dc5bd26909401d3aa8cd
Fix CommandCounterIncrement in partition-related DDL. It makes sense to do the CCIs in the places that do catalog updates, rather than before the places that error out because the former ones fail to do it. In particular, it looks like StorePartitionBound() and IndexSetParentIndex() ought to make their own CCIs. Per review comments from Peter Eisentraut for row-level triggers on partitioned tables. Discussion: https://postgr.es/m/20171229225319.ajltgss2ojkfd3kp@alvherre.pgsql https://git.postgresql.org/pg/commitdiff/4dba331cb3dc1b5ffb0680ed8efae847de216796
Fix relcache handling of the 'default' partition. My commit 4dba331cb3dc that moved around CommandCounterIncrement calls in partitioning DDL code unearthed a problem with the relcache handling for the 'default' partition: the construction of a correct relcache entry for the partitioned table was at the mercy of lack of CCI calls in non-trivial amounts of code. This was prone to creating problems later on, as the code develops. This was visible as a test failure in a compile with RELCACHE_FORCE_RELEASE (buildfarm member prion). The problem is that after the mentioned commit it was possible to create a relcache entry that had incomplete information regarding the default partition because I introduced a CCI between adding the catalog entries for the default partition (StorePartitionBound) and the update of pg_partitioned_table entry for its parent partitioned table (update_default_partition_oid). It seems the best fix is to move the latter so that it occurs inside the former; the purposeful lack of intervening CCI should be more obvious, and harder to break. I also remove a check in RelationBuildPartitionDesc that returns NULL if the key is not set. I couldn't find any place that needs this hack anymore; probably it was required because of bugs that have since been fixed. Fix a few typos I noticed while reviewing the code involved. Discussion: https://postgr.es/m/20180320182659.nyzn3vqtjbbtfgwq@alvherre.pgsql https://git.postgresql.org/pg/commitdiff/56163004b8b2151db279744b77138d4d90e2d5cb
Allow FOR EACH ROW triggers on partitioned tables. Previously, FOR EACH ROW triggers were not allowed in partitioned tables. Now we allow AFTER triggers on them, and on trigger creation we cascade to create an identical trigger in each partition. We also clone the triggers to each partition that is created or attached later. This means that deferred unique keys are allowed on partitioned tables, too. Author: Álvaro Herrera Reviewed-by: Peter Eisentraut, Simon Riggs, Amit Langote, Robert Haas, Thomas Munro Discussion: https://postgr.es/m/20171229225319.ajltgss2ojkfd3kp@alvherre.pgsql https://git.postgresql.org/pg/commitdiff/86f575948c773b0ec5b0f27066e37dd93a7f0a96

Tom Lane pushed:

Fix performance hazard in REFRESH MATERIALIZED VIEW CONCURRENTLY. Jeff Janes discovered that commit 7ca25b7de made one of the queries run by REFRESH MATERIALIZED VIEW CONCURRENTLY perform badly. The root cause is bad cardinality estimation for correlated quals, but a principled solution to that problem is some way off, especially since the planner lacks any statistics about whole-row variables. Moreover, in non-error cases this query produces no rows, meaning it must be run to completion; but use of LIMIT 1 encourages the planner to pick a fast-start, slow-completion plan, exactly not what we want. Remove the LIMIT clause, and instead rely on the count parameter we pass to SPI_execute() to prevent excess work if the query does return some rows. While we've heard no field reports of planner misbehavior with this query, it could be that people are having performance issues that haven't reached the level of pain needed to cause a bug report. In any case, that LIMIT clause can't possibly do anything helpful with any existing version of the planner, and it demonstrably can cause bad choices in some cases, so back-patch to 9.4 where the code was introduced. Thomas Munro Discussion: https://postgr.es/m/CAMkU=1z-JoGymHneGHar1cru4F1XDfHqJDzxP_CtK5cL3DOfmg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/6fbd5cce22ebd2203d99cd7dcd179d0e1138599e
Fix some corner-case issues in REFRESH MATERIALIZED VIEW CONCURRENTLY. refresh_by_match_merge() has some issues in the way it builds a SQL query to construct the "diff" table: 1. It doesn't require the selected unique index(es) to be indimmediate. 2. It doesn't pay attention to the particular equality semantics enforced by a given index, but just assumes that they must be those of the column datatype's default btree opclass. 3. It doesn't check that the indexes are btrees. 4. It's insufficiently careful to ensure that the parser will pick the intended operator when parsing the query. (This would have been a security bug before CVE-2018-1058.) 5. It's not careful about indexes on system columns. The way to fix #4 is to make use of the existing code in ri_triggers.c for generating an arbitrary binary operator clause. I chose to move that to ruleutils.c, since that seems a more reasonable place to be exporting such functionality from than ri_triggers.c. While #1, #3, and #5 are just latent given existing feature restrictions, and #2 doesn't arise in the core system for lack of alternate opclasses with different equality behaviors, #4 seems like an issue worth back-patching. That's the bulk of the change anyway, so just back-patch the whole thing to 9.4 where this code was introduced. Discussion: https://postgr.es/m/13836.1521413227@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/6497a18e6c1b5874566a77737ec3d381fded3ec2
Prevent query-lifespan memory leakage of SP-GiST traversal values. The original coding of the SP-GiST scan traversalValue feature (commit ccd6eb49a) arranged for traversal values to be stored in the query's main executor context. That's fine if there's only one index scan per query, but if there are many, we have a memory leak as successive scans create new traversal values. Fix it by creating a separate memory context for traversal values, which we can reset during spgrescan(). Back-patch to 9.6 where this code was introduced. In principle, adding the traversalCxt field to SpGistScanOpaqueData creates an ABI break in the back branches. But I (tgl) have little sympathy for extensions including spgist_private.h, so I'm not very worried about that. Alternatively we could stick the new field at the end of the struct in back branches, but that has its own downsides. Anton Dignös, reviewed by Alexander Kuzmenkov Discussion: https://postgr.es/m/CALNdv1jb6y2Te-m8xHLxLX12RsBmZJ1f4hESX7J0HjgyOhA9eA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/467963c3e9c5ba9a953959f8aebcdd7206188a18
Doc: typo fix, "PG_" should be "TG_" here. Too much PG on the brain in commit 769159fd3, evidently. Noted by marcelhuberfoo@gmail.com. Discussion: https://postgr.es/m/152154834496.11957.17112112802418832865@wrigleys.postgresql.org https://git.postgresql.org/pg/commitdiff/b6cbe9ea1a6e6879926318158d73d430c14aca90
Make configure check for a couple more Perl modules for --enable-tap-tests. Red Hat's notion of a basic Perl installation doesn't include Test::More or Time::HiRes, and reportedly some Debian installs also omit Time::HiRes. Check for those during configure to spare the user the pain of digging through check-world output to find out what went wrong. While we're at it, we should also check the version of Test::More, since TestLib.pm requires at least 0.87. In principle this could be back-patched, but it's probably not necessary. Discussion: https://postgr.es/m/516.1521475003@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/264eb03aab067da6db2a0de907a8421ce6865d60
Change oddly-chosen OID allocation. I noticed while fooling with John Naylor's bootstrap-data patch that we had one high-numbered manually assigned OID, 8888, which evidently came from a submission that the committer didn't bother to bring into line with usual OID allocation practices before committing. That's a bad idea, because it creates a hazard for other patches that may be temporarily using high OID numbers. Change it to something more in line with what we usually do. This evidently dates to commit abb173392. It's too late to change it in released branches, but we can fix it in HEAD. https://git.postgresql.org/pg/commitdiff/27ba260c739e4e10e28688993208c3ffa1b469ab
Improve predtest.c's handling of cases with NULL-constant inputs. Currently, if operator_predicate_proof() is given an operator clause like "something op NULL", it just throws up its hands and reports it can't prove anything. But we can often do better than that, if the operator is strict, because then we know that the clause returns NULL overall. Depending on whether we're trying to prove or refute something, and whether we need weak or strong semantics for NULL, this may be enough to prove the implication, especially when we rely on the standard rule that "false implies anything". In particular, this lets us do something useful with questions like "does X IN (1,3,5,NULL) imply X <= 5?" The null entry in the IN list can effectively be ignored for this purpose, but the proof rules were not previously smart enough to deduce that. This patch is by me, but it owes something to previous work by Amit Langote to try to solve problems of the form mentioned. Thanks also to Emre Hasegeli and Ashutosh Bapat for review. Discussion: https://postgr.es/m/3bad48fc-f257-c445-feeb-8a2b2fb622ba@lab.ntt.co.jp https://git.postgresql.org/pg/commitdiff/0f0deb71948321efc89cf4e3e8cbd9750cc9e566
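The IN-list question in the commit above can be reasoned through with a brute-force Python sketch. This is not predtest.c's algorithm, only an illustration of why the NULL entry can be ignored; None stands for SQL NULL, the function names are hypothetical, and only the case "the premise is true" is modeled.

```python
def strict_eq(a, b):
    """SQL-style strict equality: a NULL (None) input gives a NULL result."""
    if a is None or b is None:
        return None
    return a == b


def in_list_is_true(x, values):
    """X IN (values) is true only if some X = v arm is true.

    An arm comparing against NULL yields NULL, never true, so it cannot
    make the premise hold.
    """
    return any(strict_eq(x, v) is True for v in values)


def implies_le(values, bound, domain):
    """Check over a finite domain that X IN values implies X <= bound."""
    return all(x <= bound for x in domain if in_list_is_true(x, values))


domain = range(-10, 11)  # small hypothetical domain of non-NULL values
print(implies_le([1, 3, 5, None], 5, domain))
print(implies_le([1, 3, 7, None], 5, domain))
```

Rows where the premise is not true place no requirement on the conclusion, which is the "false implies anything" rule the commit relies on.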
Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c. Code that prints out the contents of setconfig or proconfig arrays in SQL format needs to handle GUC_LIST_QUOTE variables differently from other ones, because for those variables, flatten_set_variable_args() already applied a layer of quoting. The value can therefore safely be printed as-is, and indeed must be, or flatten_set_variable_args() will muck it up completely on reload. For all other GUC variables, it's necessary and sufficient to quote the value as a SQL literal. We'd recognized the need for this long ago, but mis-analyzed the need slightly, thinking that all GUC_LIST_INPUT variables needed the special treatment. That's actually wrong, since a valid value of a LIST variable might include characters that need quoting, although no existing variables accept such values. More to the point, we hadn't made any particular effort to keep the various places that deal with this up-to-date with the set of variables that actually need special treatment, meaning that we'd do the wrong thing with, for example, temp_tablespaces values. This affects dumping of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET commands. In ruleutils.c we can fix it reasonably honestly by exporting a guc.c function that allows discovering the flags for a given GUC variable. But pg_dump doesn't have easy access to that, so continue the old method of having a hard-wired list of affected variable names. At least we can fix it to have just one list not two, and update the list to match current reality. A remaining problem with this is that it only works for built-in GUC variables. pg_dump's list obviously knows nothing of third-party extensions, and even the "ask guc.c" method isn't bulletproof since the relevant extension might not be loaded. There's no obvious solution to that, so for now, we'll just have to discourage extension authors from inventing custom GUCs that need GUC_LIST_QUOTE. This has been busted for a long time, so back-patch to all supported branches. Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and Pavel Stehule Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz https://git.postgresql.org/pg/commitdiff/742869946f4ff121778c2e5923ab51a451b16497
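The printing rule described in this commit can be sketched in Python. Everything here is illustrative rather than pg_dump's code: the set below is a hypothetical, deliberately incomplete stand-in for the hard-wired list (only temp_tablespaces is named in the text), and quote_literal is a simplified quoting helper that only doubles single quotes.

```python
# Hypothetical stand-in for the hard-wired list of GUC_LIST_QUOTE variables.
LIST_QUOTE_VARIABLES = {"temp_tablespaces"}


def quote_literal(value):
    """Simplified SQL literal quoting: wrap in quotes, double embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def format_set_clause(name, stored_value):
    """Render a stored setting as a SET command."""
    if name in LIST_QUOTE_VARIABLES:
        # The stored value already carries one layer of quoting; print as-is.
        return "SET %s TO %s" % (name, stored_value)
    # Every other variable is quoted as a single SQL literal.
    return "SET %s TO %s" % (name, quote_literal(stored_value))


print(format_set_clause("temp_tablespaces", '"space one", space2'))
print(format_set_clause("application_name", "it's"))
```

Quoting the first value a second time would turn the list into one literal string, which is the kind of reload damage the commit message warns about.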
Prevent extensions from creating custom GUCs that are GUC_LIST_QUOTE. Pending some solution for the problems noted in commit 742869946, disallow dynamic creation of GUC_LIST_QUOTE variables. If there are any extensions out there using this feature, they'd not be happy for us to start enforcing this rule in minor releases, so this is a HEAD-only change. The previous commit didn't make things any worse than they already were for such cases. Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz https://git.postgresql.org/pg/commitdiff/846b5a525746b83813771ec4720d664408c47c43
Fix errors in contrib/bloom index build. Count the number of tuples in the index honestly, instead of assuming that it's the same as the number of tuples in the heap. (It might be different if the index is partial.) Fix counting of tuples in current index page, too. This error would have led to failing to write out the final page of the index if it contained exactly one tuple, so that the last tuple of the relation would not get indexed. Back-patch to 9.6 where contrib/bloom was added. Tomas Vondra and Tom Lane Discussion: https://postgr.es/m/3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com https://git.postgresql.org/pg/commitdiff/c35b47286960d2c7885dce162ddfe26939d0d373
Fix tuple counting in SP-GiST index build. Count the number of tuples in the index honestly, instead of assuming that it's the same as the number of tuples in the heap. (It might be different if the index is partial.) Back-patch to all supported versions. Tomas Vondra Discussion: https://postgr.es/m/3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com https://git.postgresql.org/pg/commitdiff/649f1792508fb040a9b70c68dfedd6b93897e087
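Both index-build fixes above rest on the same observation: a partial index holds fewer tuples than the heap. A minimal Python sketch, with a hypothetical heap and predicate, counts what is actually indexed instead of reusing the heap count.

```python
# Hypothetical heap of ten rows; only "active" rows satisfy the index predicate.
heap = [{"id": i, "active": i % 3 == 0} for i in range(10)]


def build_partial_index(rows, predicate):
    """Index only rows matching predicate; return the entries and their count."""
    entries = [row["id"] for row in rows if predicate(row)]
    return entries, len(entries)  # count the tuples that were really indexed


entries, index_tuples = build_partial_index(heap, lambda row: row["active"])
print(len(heap), index_tuples)
```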
Sync up our various ways of estimating pg_class.reltuples. VACUUM thought that reltuples represents the total number of tuples in the relation, while ANALYZE counted only live tuples. This can cause "flapping" in the value when background vacuums and analyzes happen separately. The planner's use of reltuples essentially assumes that it's the count of live (visible) tuples, so let's standardize on having it mean live tuples. Another issue is that the definition of "live tuple" isn't totally clear; what should be done with INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples? ANALYZE's choices in this regard are made on the assumption that if the originating transaction commits at all, it will happen after ANALYZE finishes, so we should ignore the effects of the in-progress transaction --- unless it is our own transaction, and then we should count it. Let's propagate this definition into VACUUM, too. Likewise propagate this definition into CREATE INDEX, and into contrib/pgstattuple's pgstattuple_approx() function. Tomas Vondra, reviewed by Haribabu Kommi, some corrections by me Discussion: https://postgr.es/m/16db4468-edfa-830a-f921-39a50498e77e@2ndquadrant.com https://git.postgresql.org/pg/commitdiff/7c91a0364fcf5d739a09cc87e7adb1d4a33ed112
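One reading of the "live tuple" rule described above, as a Python sketch. The enum and function names are hypothetical, and the tuple states are simplified to the four cases the commit message discusses.

```python
from enum import Enum


class Status(Enum):
    LIVE = "live"
    DEAD = "dead"
    INSERT_IN_PROGRESS = "insert_in_progress"
    DELETE_IN_PROGRESS = "delete_in_progress"


def counts_as_live(status, by_own_xact):
    """Ignore the effects of in-progress transactions unless they are ours."""
    if status is Status.LIVE:
        return True
    if status is Status.DEAD:
        return False
    if status is Status.INSERT_IN_PROGRESS:
        # Another transaction's insert is treated as not having happened yet;
        # our own insert counts.
        return by_own_xact
    # DELETE_IN_PROGRESS: another transaction's delete is ignored, so the
    # tuple still counts; our own delete removes it from the count.
    return not by_own_xact


# Each entry is (status, whether the in-progress transaction is our own).
tuples = [
    (Status.LIVE, False),
    (Status.DEAD, False),
    (Status.INSERT_IN_PROGRESS, False),
    (Status.INSERT_IN_PROGRESS, True),
    (Status.DELETE_IN_PROGRESS, False),
    (Status.DELETE_IN_PROGRESS, True),
]
reltuples = sum(counts_as_live(status, own) for status, own in tuples)
print(reltuples)
```

Using one such rule everywhere is what keeps the estimate from flapping between a VACUUM-style and an ANALYZE-style count.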
Improve style guideline compliance of assorted error-report messages. Per the project style guide, details and hints should have leading capitalization and end with a period. On the other hand, errcontext should not be capitalized and should not end with a period. To support well formatted error contexts in dblink, extend dblink_res_error() to take a format+arguments rather than a hardcoded string. Daniel Gustafsson Discussion: https://postgr.es/m/B3C002C8-21A0-4F53-A06E-8CAB29FCF295@yesql.se https://git.postgresql.org/pg/commitdiff/feb8254518752b2cb4a8964c374dd82d49ef0e0d
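The two conventions stated above are simple enough to check mechanically. The following Python sketch uses hypothetical function names and made-up example messages; it is not part of PostgreSQL's tooling.

```python
def detail_or_hint_ok(text):
    """Details and hints: leading capital letter and a trailing period."""
    return text[:1].isupper() and text.endswith(".")


def errcontext_ok(text):
    """errcontext: no leading capital letter and no trailing period."""
    return not text[:1].isupper() and not text.endswith(".")


# Made-up messages for illustration only.
print(detail_or_hint_ok("Key (id)=(1) already exists."))
print(detail_or_hint_ok("key already exists"))
print(errcontext_ok("while executing query on remote connection"))
print(errcontext_ok("While executing query."))
```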
Fix make rules that generate multiple output files. For years, our makefiles have correctly observed that "there is no correct way to write a rule that generates two files". However, what we did is to provide empty rules that "generate" the secondary output files from the primary one, and that's not right either. Depending on the details of the creating process, the primary file might end up timestamped later than one or more secondary files, causing subsequent make runs to consider the secondary file(s) out of date. That's harmless in a plain build, since make will just re-execute the empty rule and nothing happens. But it's fatal in a VPATH build, since make will expect the secondary file to be rebuilt in the build directory. This would manifest as "file not found" failures during VPATH builds from tarballs, if we were ever unlucky enough to ship a tarball with apparently out-of-date secondary files. (It's not clear whether that has ever actually happened, but it definitely could.) To ensure that secondary output files have timestamps >= their primary's, change our makefile convention to be that we provide a "touch $@" action not an empty rule. Also, make sure that this rule actually gets invoked during a distprep run, else the hazard remains. It's been like this a long time, so back-patch to all supported branches. In HEAD, I skipped the changes in src/backend/catalog/Makefile, because those rules are due to get replaced soon in the bootstrap data format patch, and there seems no need to create a merge issue for that patch. If for some reason we fail to land that patch in v11, we'll need to back-fill the changes in that one makefile from v10. Discussion: https://postgr.es/m/18556.1521668179@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/4b538727e2a0e5eae228650c1c145c90471aa521
Mop-up for commit feb8254518752b2cb4a8964c374dd82d49ef0e0d. Missed these occurrences of some of the adjusted error messages. Per buildfarm member pademelon. https://git.postgresql.org/pg/commitdiff/da616950cee395919f835b5cbec3d23c4844015a
Stabilize regression test result. If random() returns a result sufficiently close to zero, float8out switches to scientific notation, breaking this test case's expectation that the output should look like '0.xxxxxxxxx'. Casting to numeric should fix that. Per buildfarm member pogona. Discussion: https://postgr.es/m/20180324212502.wt4serghfidge2on@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/038a2ed1392363a59adeee4e86d848ca74ce39c5
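The same kind of formatting switch can be reproduced outside PostgreSQL. In this Python sketch, Python's float formatting is only an analogy for float8out, and Decimal stands in for the numeric cast; the sample value is hypothetical.

```python
from decimal import Decimal

small = 0.00001234  # stands in for a random() result close to zero

as_float = repr(small)  # float formatting switches to scientific notation here
as_numeric = format(Decimal(as_float), "f")  # fixed-point, like a numeric cast

print(as_float)
print(as_numeric)
```

A test that expects output of the form '0.xxxxxxxxx' only passes reliably with the fixed-point rendering.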
Add #includes missed in commit e22b27f0cb3ee03ee300d431997f5944ccf2d7b3. Leaving out getopt_long.h works on some platforms, but not all. Per buildfarm. Discussion: https://postgr.es/m/20180325030552.f462zqmohs6cqekg@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/2dd3f969f5f2de92182038d1e33b11c798688bc9
Doc: remove extra comma in syntax summary for array_fill(). Noted by Scott Ure. Back-patch to all supported branches. Discussion: https://postgr.es/m/152199346794.4544.1888397173908716912@wrigleys.postgresql.org https://git.postgresql.org/pg/commitdiff/ee4a2c4a0345f2589ce32b64493b1b14e87f0465
Remove useless if-test. Coverity complained that this check is pointless, and it's right. There is no case where we'd call ExecutorStart with a null plannedstmt, and if we did, it'd have crashed before here. Thinko in commit cc415a56d. https://git.postgresql.org/pg/commitdiff/3a2cb59887421a04b5ee158580198d731d115c61

Andrew Dunstan pushed:

Don't use an Msys virtual path to create a tablespace. The new unlogged_reinit recovery tests create a new tablespace using TestLib.pm's tempdir. However, on msys that function returns a virtual path that isn't understood by Postgres. Here we add a new function to TestLib.pm to turn such a path into a real path on the underlying file system, and use it in the new test to create the tablespace. The new function is essentially a NOOP everywhere but msys. https://git.postgresql.org/pg/commitdiff/9ad21a6957ff2d8743e9a59ba062d3c009b24ec4

Peter Eisentraut pushed:

Add missing break. https://git.postgresql.org/pg/commitdiff/13c7c65ec900a30bcddcb27f5fd138dcdbc2ca2e
Attempt to fix build with unusual OpenSSL versions. Since e3bdb2d92600ed45bd46aaf48309a436a9628218, libpq failed to build on some platforms because they did not have SSL_clear_options(). Although mainline OpenSSL introduced SSL_clear_options() after SSL_OP_NO_COMPRESSION, so the code should have built fine, at least an old NetBSD version (build farm "coypu" NetBSD 5.1 gcc 4.1.3 PR-20080704 powerpc) has SSL_OP_NO_COMPRESSION but no SSL_clear_options(). So add a configure check for SSL_clear_options(). If we don't find it, skip the call. That means on such a platform one cannot *enable* SSL compression if the built-in default is off, but that seems an unlikely combination anyway and not very interesting in practice. https://git.postgresql.org/pg/commitdiff/a364dfa4ac7337743050256c6eb17b5db5430173
doc: Small wording improvement. https://git.postgresql.org/pg/commitdiff/d652e3525b8ff988db717ed66c467b6fd78a32bc
Add configure tests for stdbool.h and sizeof bool. This will allow us to assess how many platforms have bool with a size other than 1, which will help us decide how to go forward with using stdbool.h. Discussion: https://www.postgresql.org/message-id/flat/3a0fe7e1-5ed1-414b-9230-53bbc0ed1f49@2ndquadrant.com https://git.postgresql.org/pg/commitdiff/f20b3285340cc0576ab8445f483700983cf2ba9f
Handle heap rewrites even better in logical decoding. Logical decoding should not publish anything about tables created as part of a heap rewrite during DDL. Those tables don't exist externally, so consumers of logical decoding cannot do anything sensible with that information. In ab28feae2bd3d4629bd73ae3548e671c57d785f0, we worked around this for built-in logical replication, but that was a hack. This is a more proper fix: We mark such transient heaps using the new field pg_class.relwrite, linking to the original relation OID. By default, we ignore them in logical decoding before they get to the output plugin. Optionally, a plugin can register its interest in getting such changes, if it handles DDL specially, in which case the new field will help it get information about the actual table. Reviewed-by: Craig Ringer <craig@2ndquadrant.com> https://git.postgresql.org/pg/commitdiff/325f2ec5557fd1c9156c910102522e04cb42d99c
pg_controldata: Prevent division-by-zero errors. If the control file is corrupted and specifies the WAL segment size to be 0 bytes, calculating the latest checkpoint's REDO WAL file will fail with a division-by-zero error. Show it as "???" instead. Also reword the warning message a bit and send it to stdout, like the other pre-existing warning messages. Add some tests for dealing with a corrupted pg_control file. Author: Nathan Bossart <bossartn@amazon.com>, tests by me https://git.postgresql.org/pg/commitdiff/4731d848f23e08a9396b4831d13fbb6dd460faf2
Remove stdbool workaround in sepgsql. Since we now use stdbool.h in c.h, this workaround breaks the build and is no longer necessary, so remove it. (Technically, there could be platforms with a 4-byte bool in stdbool.h, in which case we would not include stdbool.h in c.h, and so the old problem that caused this workaround would reappear. But this combination is not known to happen on the range of platforms where sepgsql can be built.) https://git.postgresql.org/pg/commitdiff/5c4920be303e0ab894c9a3a48e780b7e0e56240b
Fix whitespace. https://git.postgresql.org/pg/commitdiff/fdb78948d89b5cc018e3dbf851fafd1652cb5921
Use stdbool.h if suitable. Using the standard bool type provided by C allows some recent compilers and debuggers to give better diagnostics. Also, some extension code and third-party headers are increasingly pulling in stdbool.h, so it's probably saner if everyone uses the same definition. But PostgreSQL code is not prepared to handle bool of a size other than 1, so we keep our own old definition if we encounter a stdbool.h with a bool of a different size. (Among current build farm members, this only applies to old macOS versions on PowerPC.) To check that the bool in use is of the right size, add a static assertion about the size of GinTernaryValue vs bool. This is currently the only place that assumes that bool and char are of the same size. Discussion: https://www.postgresql.org/message-id/flat/3a0fe7e1-5ed1-414b-9230-53bbc0ed1f49@2ndquadrant.com https://git.postgresql.org/pg/commitdiff/9a95a77d9d5d3003d2d67121f2731b6e5fc37336
pg_resetwal: Add simple test suite. Some subsequent patches will add to this, but to avoid conflicts, set up the basics separately. https://git.postgresql.org/pg/commitdiff/5700aa130186e0b5d600806645b051bfd9067f09
pg_resetwal: Prevent division-by-zero errors. Handle the case where the pg_control file specifies a WAL segment size of 0 bytes. This would previously have led to a division-by-zero error. Change this to assume the whole file is corrupt and fall back to guessing everything. Discussion: https://www.postgresql.org/message-id/a6163ad7-cc99-fdd1-dfad-25df73032ab8%402ndquadrant.com https://git.postgresql.org/pg/commitdiff/f1a074b146c900bd439b6ef1953866f41b61a669
Further fix interaction of Perl and stdbool.h. In the case that PostgreSQL uses stdbool.h but Perl doesn't, we need to prevent Perl from defining bool, to prevent compiler warnings about redefinition. https://git.postgresql.org/pg/commitdiff/66ee8513d10fb207907d61dd6cf42db7d703af5d
Fix interaction of Perl and stdbool.h. Revert the PL/Perl-specific change in 9a95a77d9d5d3003d2d67121f2731b6e5fc37336. We must not prevent Perl from using stdbool.h when it has been built to do so, even if it uses an incompatible size. Otherwise, we would be imposing our bool on Perl, which will lead to crashes because of the size mismatch. Instead, we undef bool after including the Perl headers, as we did previously, but now only if we are not using stdbool.h ourselves. Record that choice in c.h as USE_STDBOOL. This will also make it easier to apply that coding pattern elsewhere if necessary. https://git.postgresql.org/pg/commitdiff/7ba7986fb4364e889a705c9973fefa138650091c
Small refactoring. Put the "atomic" argument of ExecuteDoStmt() and ExecuteCallStmt() into a variable instead of repeating the formula. https://git.postgresql.org/pg/commitdiff/52f3a9d6a32c0c070a15486c3aecbc4405d2da88
initdb: Improve --wal-segsize handling. Give separate error messages for when the argument is not a number and when it is not the right kind of number. Fix wording in the help message. https://git.postgresql.org/pg/commitdiff/496d56670af44a2a578c15195c36f797e29cff24
Improve pg_resetwal documentation. Clarify that the -l option takes a file name, not an "address", and that that might be different from the LSN if nondefault WAL segment sizes are used. https://git.postgresql.org/pg/commitdiff/4644a1170f0ad88f92d2835f589fffb6aa38c129
Add long options to pg_resetwal and pg_controldata. We were running out of good single-letter options for some upcoming pg_resetwal functionality, so add long options to create more possibilities. Add to pg_controldata as well for symmetry. Based on a patch by Nathan Bossart <bossartn@amazon.com> https://git.postgresql.org/pg/commitdiff/e22b27f0cb3ee03ee300d431997f5944ccf2d7b3
pg_resetwal: Fix logical typo in code. Introduced in f1a074b146c900bd439b6ef1953866f41b61a669 https://git.postgresql.org/pg/commitdiff/cc547cf08fe62e90f34a780a6b4fe428336ab3ec
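
The WAL segment size handling touched by the initdb, pg_controldata and pg_resetwal commits in this list can be sketched as follows. This is a minimal Python illustration, not the C code from those commits. The function names and error wording are hypothetical, and the sketch assumes that a valid segment size is a power of two between 1 MB and 1 GB and that a WAL file name consists of the timeline and the segment number written as three 8-digit hexadecimal fields.

```python
MB = 1024 * 1024


def parse_wal_segsize_mb(arg):
    """Validate a --wal-segsize style argument given in megabytes.

    Two distinct errors, as described above: one when the argument is not
    a number at all, one when it is a number of the wrong kind.
    (Hypothetical helper; the messages are illustrative wording.)
    """
    try:
        mb = int(arg)
    except ValueError:
        raise ValueError("segment size must be a number") from None
    if not (1 <= mb <= 1024 and mb & (mb - 1) == 0):
        raise ValueError("segment size must be a power of 2 between 1 and 1024")
    return mb


def is_valid_segsize(seg_size_bytes):
    """Assumed validity rule: power of two between 1 MB and 1 GB."""
    return (MB <= seg_size_bytes <= 1024 * MB
            and seg_size_bytes & (seg_size_bytes - 1) == 0)


def redo_wal_file_name(timeline, redo_lsn, seg_size_bytes):
    """Name of the WAL file containing redo_lsn, or "???" if the segment
    size read from a (possibly corrupted) control file is unusable."""
    if not is_valid_segsize(seg_size_bytes):
        # A size of 0 would otherwise cause a division by zero below.
        return "???"
    segno = redo_lsn // seg_size_bytes
    segs_per_xlogid = 0x100000000 // seg_size_bytes
    return "%08X%08X%08X" % (timeline,
                             segno // segs_per_xlogid,
                             segno % segs_per_xlogid)
```

With a 16 MB segment size, `redo_wal_file_name(1, 0x16B3748, 16 * MB)` yields `000000010000000000000001`, while a segment size of 0 yields `???` instead of raising an error.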

Andres Freund pushed:

Add PGAC_PROG_VARCC_VARFLAGS_OPT autoconf macro. The new macro allows testing flags for different compilers and storing them in different CFLAGS-like variables. The existing PGAC_PROG_CC_CFLAGS_OPT and PGAC_PROG_CC_VAR_OPT are changed to be just wrappers around the new function. This'll be used by the upcoming LLVM support, to separately detect capabilities used by clang, when generating bitcode. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/3de04e4ed12d0794e87e1db2e729d126cf183a58
Add C++ support to configure. This is an optional dependency. It'll be used for the upcoming LLVM based just in time compilation support, which needs to wrap a few LLVM C++ APIs so they're accessible from C. For now test for C++ compilers unconditionally, without failing if not present, to ensure wide buildfarm coverage. If we're bothered by the additional test times (which are quite short) or verbosity, we can later make the tests conditional on --with-llvm. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/6869b4f2584787d9e4cefaab8a4bae1ecbe63766
Add configure infrastructure (--with-llvm) to enable LLVM support. LLVM will be used for *optional* Just-in-time compilation support. This commit just adds the configure infrastructure that detects LLVM. No documentation has been added for the --with-llvm flag, that'll be added after the actual supporting code has been added. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/5b2526c83832e4e8a9f8db0389904ed2fb50ed37
Handle EEOP_FUNCEXPR_[STRICT_]FUSAGE out of line. This isn't a very common op, and it doesn't seem worth duplicating for JIT. Author: Andres Freund https://git.postgresql.org/pg/commitdiff/4c0000b839e6d4593e63439879b0c2abea14f426
Basic JIT provider and error handling infrastructure. This commit introduces: 1) JIT provider abstraction, which allows JIT functionality to be implemented in separate shared libraries. That's desirable because it allows installing JIT support as a separate package, and because it allows experimentation with different forms of JITing. 2) JITContexts which can be, using functions introduced in follow up commits, used to emit JITed functions, and have them be cleaned up on error. 3) The outline of an LLVM JIT provider, which will be fleshed out in subsequent commits. Documentation for GUCs added, and for JIT in general, will be added in later commits. Author: Andres Freund, with architectural input from Jeff Davis Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/432bb9e04da4d4a1799b1fe7c723b975cb070c43
Fix typo in BITCODE_CXXFLAGS assignment. Typoed-In: 5b2526c83832e Reported-By: Catalin Iacob https://git.postgresql.org/pg/commitdiff/4317cc68a284f041abc583ced4ef7ede2f73fb51
Empty CXXFLAGS inherited from autoconf. We do the same for CFLAGS. This was an omission in 6869b4f25. Reported-By: Catalin Iacob https://git.postgresql.org/pg/commitdiff/a02671cfdeac3bb86ebf8f8577faf69730c4f80e
Add file containing extensions of the LLVM C API. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/31bc604e0b74805ff9e84a2d549ca82be665d0a6
Support for optimizing and emitting code in LLVM JIT provider. This commit introduces the ability to actually generate code using LLVM. In particular, this adds: - Ability to emit code both in heavily optimized and largely unoptimized fashion - Batching facility to allow functions to be defined in small increments, but optimized and emitted in executable form in larger batches (for performance and memory efficiency) - Type and function declaration synchronization between runtime generated code and normal postgres code. This is critical to be able to access struct fields etc. - Developer oriented jit_dump_bitcode GUC, for inspecting / debugging the generated code. - per JitContext statistics of number of functions, time spent generating code, optimizing, and emitting it. This will later be employed for EXPLAIN support. This commit doesn't yet contain any code actually generating functions. That'll follow in later commits. Documentation for GUCs added, and for JIT in general, will be added in later commits. Author: Andres Freund, with contributions by Pierre Ducroquet Testing-By: Thomas Munro, Peter Eisentraut Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/b96d550eb03cfdb000def70912ec840dbe7f67da
Add helpers for emitting LLVM IR. These basically just help to make code a bit more concise and pgindent proof. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/7ec0d80c0508eae35ac8e19d041f9ba1276de08e
Basic planner and executor integration for JIT. This adds simple cost based plan time decision about whether JIT should be performed. jit_above_cost, jit_optimize_above_cost are compared with the total cost of a plan, and if the cost is above them JIT is performed / optimization is performed respectively. For that PlannedStmt and EState have a jitFlags (es_jit_flags) field that stores information about what JIT operations should be performed. EState now also has a new es_jit field, which can store a JitContext. When there are no errors the context is released in standard_ExecutorEnd(). It is likely that the default values for jit_[optimize_]above_cost will need to be adapted further, but in my test these values seem to work reasonably. Author: Andres Freund, with feedback by Peter Eisentraut Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/cc415a56d09a8da7c919088036b6097b70f10791
Add FIELDNO_* macro designating offset into structs required for JIT. For any interesting JIT target, fields inside structs need to be accessed. b96d550e contains infrastructure for syncing the definition of types between postgres C code and runtime code generation with LLVM. But that doesn't sync the number or names of fields inside structs, just the types (including padding etc). One option would be to hardcode the offset numbers in the JIT code, but that'd be hard to keep in sync. Instead add macros indicating the field offset to the fields that need to be accessed. Not pretty, but manageable. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/7ced1d1247286399df53823eb76cacaf6d7fdb22
Expand list of synchronized types and functions in LLVM JIT provider. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/fb46ac26fe493839d6cf3ab8d20bc62a285f7649
Add expression compilation support to LLVM JIT provider. In addition to the interpretation of expressions (which back evaluation of WHERE clauses, target list projection, aggregate transition values etc.) support compiling expressions to native code, using the infrastructure added in earlier commits. To avoid duplicating a lot of code, only support emitting code for cases that are likely to be performance critical. For expression steps that aren't deemed that, use the existing interpreter. The generated code isn't great - some architectural changes are required to address that. But this already yields a significant speedup for some analytics queries, particularly with WHERE clauses filtering a lot, or computing multiple aggregates. Author: Andres Freund Tested-By: Thomas Munro Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/2a0faed9d7028e3830998bd6ca900be651274e27
Disable JITing for VALUES() nodes. VALUES() nodes are only ever executed once. This is primarily helpful for debugging, when forcing JITing even for cheap queries. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
Debugging and profiling support for LLVM JIT provider. This currently requires patches to the LLVM codebase to be effective (submitted upstream), the GUCs are available without those patches however. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/250bca7fc145b143d5e9aeeca66f0bb36cf4d5ef
Adapt expression JIT to stdbool.h introduction. The LLVM JIT provider uses clang to synchronize types between normal C code and runtime generated code. Clang represents stdbool.h style booleans in return values & parameters differently from booleans stored in variables. Thus the expression compilation code from 2a0faed9d needs to be adapted to 9a95a77d9. Instead of hardcoding i8 as the type for booleans (which already was wrong on some edge case platforms!), use postgres' notion of a boolean as used for storage and for parameters. Per buildfarm animal xenodermus. Author: Andres Freund https://git.postgresql.org/pg/commitdiff/2111a48a0c5e5198a68cba0c8fb82c4f61be5928
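
The cost-based decision described in the planner and executor integration commit above can be sketched roughly as follows. This is a simplified Python illustration rather than the planner's C code: the flag constants and the function name are hypothetical, the thresholds are passed in as plain parameters, and nesting the optimization check inside the JIT check is an assumption of this sketch.

```python
# Hypothetical flag bits, standing in for the "jitFlags" field mentioned above.
JIT_NONE = 0
JIT_PERFORM = 1 << 0    # compile expressions at all
JIT_OPTIMIZE = 1 << 1   # additionally run expensive optimizations


def jit_flags_for_plan(total_cost, jit_above_cost, jit_optimize_above_cost):
    """Decide at plan time what JIT work the executor should do.

    The plan's total cost is compared with each threshold in turn:
    cheap plans are interpreted, costlier plans are JIT compiled, and
    only the most expensive plans also pay for optimization.
    """
    flags = JIT_NONE
    if total_cost > jit_above_cost:
        flags |= JIT_PERFORM
        if total_cost > jit_optimize_above_cost:
            flags |= JIT_OPTIMIZE
    return flags
```

For example, with thresholds of 100 and 500, a plan costing 200 would be compiled without optimization, and a plan costing 1000 would be compiled and optimized.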

Teodor Sigaev pushed:

Rework word_similarity documentation to bring it closer to the actual algorithm. word_similarity was previously documented as returning the similarity of the closest word in a string, but it actually returns the similarity of a substring. Also fix mistyped comments. Author: Alexander Korotkov Review by: David Steele, Liudmila Mantrova Discussions: https://www.postgresql.org/message-id/flat/CY4PR17MB13207ED8310F847CF117EED0D85A0@CY4PR17MB1320.namprd17.prod.outlook.com https://www.postgresql.org/message-id/flat/f43b242d-000c-f4c8-cb8b-d37e9752cd93%40postgrespro.ru https://git.postgresql.org/pg/commitdiff/aea7c17e86e99a7ed4da489b3df2b5493b5e5e95
Add strict_word_similarity to pg_trgm module. strict_word_similarity is similar to existing word_similarity function but it takes into account word boundaries to compute similarity. Author: Alexander Korotkov Review by: David Steele, Liudmila Mantrova, me Discussion: https://www.postgresql.org/message-id/flat/CY4PR17MB13207ED8310F847CF117EED0D85A0@CY4PR17MB1320.namprd17.prod.outlook.com https://git.postgresql.org/pg/commitdiff/be8a7a6866276b228b4ffaa3003e1dc2dd1d140a
UINT64CONST'fy long constants in pgbench. Commit e51a04840a1c45db101686bef0b7025d5014c74b missed some 64-bit constants; wrap them with UINT64CONST(). Per buildfarm member dromedary and gripe from Tom Lane https://git.postgresql.org/pg/commitdiff/2216fded1ebc9940f3e4c9454cb2f5c937794f1c
Add conditional.c to libpgfeutils for MSVC build. conditional.c was moved in commit f67b113ac62777d18cd20d3c4d05be964301b936 but was not added to the Windows build system. I don't have a Windows box, so this is a blind attempt. https://git.postgresql.org/pg/commitdiff/2058d6a22b43a97d1069a51bd95ad56759b3c7bc
Add \if support to pgbench. This patch adds \if to pgbench, as was done for psql. The implementation shares the condition stack code with psql, so that code is moved to the fe_utils directory. Author: Fabien COELHO with minor edits by me Review by: Vik Fearing, Fedor Sigaev Discussion: https://www.postgresql.org/message-id/flat/alpine.DEB.2.20.1711252200190.28523@lancre https://git.postgresql.org/pg/commitdiff/f67b113ac62777d18cd20d3c4d05be964301b936
Add general-purpose hashing functions to pgbench. Hashing functions are useful for simulating real-world workloads in tests such as web workloads, for example YCSB benchmarks. Author: Ildar Musin with minor edits by me Reviewed by: Fabien Coelho, me Discussion: https://www.postgresql.org/message-id/flat/0e8bd39e-dfcd-2879-f88f-272799ad7ef2@postgrespro.ru https://git.postgresql.org/pg/commitdiff/e51a04840a1c45db101686bef0b7025d5014c74b
Exclude unlogged tables from base backups. Exclude unlogged tables from base backups entirely, except for the init fork, which marks a created unlogged table. The next question is whether to also skip temp tables, but that is a story for a separate patch. Author: David Steele Review by: Adam Brightwell, Masahiko Sawada Discussion: https://www.postgresql.org/message-id/flat/04791bab-cb04-ba43-e9c0-664a4c1ffb2c@pgmasters.net https://git.postgresql.org/pg/commitdiff/8694cc96b52a967a49725f32be7aa77fd3b6ac25
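
As background for the two pg_trgm commits in this list, the trigram set similarity that word_similarity and strict_word_similarity build on can be sketched as below. This Python sketch covers only the basic whole-string similarity. It does not reproduce word_similarity or strict_word_similarity themselves, which compare against parts of the second string. It also assumes ASCII alphanumeric words, lower-casing, and padding of each word with two leading spaces and one trailing space.

```python
import re


def trigrams(text):
    """Return the set of trigrams of a string, word by word.

    Each word is lower-cased and padded with two spaces in front and one
    space behind before its three-character windows are collected.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

Under these assumptions, `similarity("word", "two words")` is 4/11: the strings share four trigrams out of eleven distinct ones. The whole-string score is low even though "word" is almost entirely contained in the second string, which is the kind of case the word-oriented functions are meant to handle.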

Andrew Gierth pushed:

Repair crash with unsortable grouping sets. If there were multiple grouping sets, none of them empty, all of which were unsortable, then an oversight in consider_groupingsets_paths led to a null pointer dereference. Fix, and add a regression test for this case. Per report from Dang Minh Huong, though I didn't use their patch. Backpatch to 10.x where hashed grouping sets were added. https://git.postgresql.org/pg/commitdiff/d2d79887eadff72c339a072ef693bb6016651d30

Tatsuo Ishii pushed:

Fix typo. Patch by me. https://git.postgresql.org/pg/commitdiff/8bb3c7d347f0c74aa12beeef3599984021323e7d

Dean Rasheed pushed:

Improve ANALYZE's strategy for finding MCVs. Previously, a value was included in the MCV list if its frequency was 25% larger than the estimated average frequency of all nonnull values in the table. For uniform distributions, that can lead to values being included in the MCV list and significantly overestimated on the basis of relatively few (sometimes just 2) instances being seen in the sample. For non-uniform distributions, it can lead to too few values being included in the MCV list, since the overall average frequency may be dominated by a small number of very common values, while the remaining values may still have a large spread of frequencies, causing both substantial overestimation and underestimation of the remaining values. Furthermore, increasing the statistics target may have little effect because the overall average frequency will remain relatively unchanged. Instead, populate the MCV list with the largest set of common values that are statistically significantly more common than the average frequency of the remaining values. This takes into account the variance of the sample counts, which depends on the counts themselves and on the proportion of the table that was sampled. As a result, it constrains the relative standard error of estimates based on the frequencies of values in the list, reducing the chances of too many values being included. At the same time, it allows more values to be included, since the MCVs need only be more common than the remaining non-MCVs, rather than the overall average. Thus it tends to produce fewer MCVs than the previous code for uniform distributions, and more for non-uniform distributions, reducing estimation errors in both cases. In addition, the algorithm responds better to increasing the statistics target, allowing more values to be included in the MCV list when more of the table is sampled. Jeff Janes, substantially modified by me. Reviewed by John Naylor and Tomas Vondra. 
Discussion: https://postgr.es/m/CAMkU=1yvdGvW9TmiLAhz2erFnvnPFYHbOZuO+a=4DVkzpuQ2tw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/b5db1d93d2a6e2d3186f8798a5d06e07b7536a1d
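
The idea behind the new MCV selection can be sketched as follows. This is a simplified Python illustration of the approach described above, not the code from the commit. The function name is hypothetical, and the specific test used here (a two-standard-deviation margin, a hypergeometric variance with finite population correction, and a 0.5 continuity correction) is an assumption of the sketch.

```python
import math


def choose_num_mcv(counts, sample_rows, total_rows, n_distinct, null_frac=0.0):
    """Return how many of the candidate values to keep in the MCV list.

    counts: sample counts of the candidate values, sorted in descending order.
    Candidates are dropped from the tail until the least common remaining
    one is significantly more common than the average of the values that
    are not in the list.
    """
    num_mcv = len(counts)
    if sample_rows >= total_rows or total_rows <= 1:
        return num_mcv  # whole table sampled: counts are exact
    N, n = float(total_rows), float(sample_rows)
    sum_count = sum(counts)
    while num_mcv > 0:
        # Average frequency of the values outside the current list.
        rest_freq = max(0.0, 1.0 - sum_count / n - null_frac)
        other_distinct = n_distinct - num_mcv
        if other_distinct > 1:
            rest_freq /= other_distinct
        candidate = counts[num_mcv - 1]
        # Variance of the sample count, given the share of the table sampled.
        K = N * candidate / n
        variance = n * K * (N - K) * (N - n) / (N * N * (N - 1))
        if candidate > rest_freq * n + 2.0 * math.sqrt(variance) + 0.5:
            break  # significantly more common than the rest: keep it
        sum_count -= candidate
        num_mcv -= 1
    return num_mcv
```

With a uniform sample such as `choose_num_mcv([3, 3, 3, 3], 300, 30000, 100)` no value qualifies, whereas for a skewed sample such as `choose_num_mcv([150, 3], 300, 30000, 50)` only the dominant value is kept.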

Noah Misch pushed:

Don't qualify type pg_catalog.text in extend-extensions-example. Extension scripts begin execution with pg_catalog at the front of the search path, so type names reliably refer to pg_catalog. Remove these superfluous qualifications. Earlier <programlisting> of this <sect1> already omitted them. Back-patch to 9.3 (all supported versions). https://git.postgresql.org/pg/commitdiff/c92f7c62232c67b1a35ca5524a41a5cddfe66746

Pending patches

Tom Lane sent in a patch to fix some issues in matview.c's refresh-query construction.

Daniel Gustafsson and Magnus Hagander traded patches to make it possible to enable checksums online.

Tomas Vondra sent in another revision of a patch to implement multivariate histograms and MCV lists.

Chapman Flack sent in another revision of a patch to zero headers of unused pages after WAL switch and add a test for ensuring WAL segment is zeroed out.

Amul Sul sent in another revision of a patch to restrict concurrent update/delete with UPDATE of partition key.

Kyotaro HORIGUCHI sent in another revision of a patch to restrict maximum keep segments by repslots.

Andrey Borodin sent in another revision of a patch to add SLRU checksums.

Masahiko Sawada sent in another revision of a patch to qualify datatype name in log of data type conversion on subscriber and add a test module for same.

Masahiko Sawada sent in another revision of a patch to add a vacuum_cleanup_index_scale_factor GUC.

Etsuro Fujita, Amit Langote, and Álvaro Herrera traded patches to make ON CONFLICT .. DO UPDATE work on partitioned tables.

Amit Langote and David Rowley traded patches to lay infrastructure for speeding up partition pruning.

Artur Zakirov and Tomas Vondra traded patches to implement shared ISpell dictionaries.

Pavel Stěhule sent in two more revisions of a patch to add extra checks to PL/pgSQL.

Thomas Munro sent in a patch to add docs to the top-level Makefile for non-GNU make.

Pavel Stěhule sent in a patch to enable procedures to be called with default arguments in PL/pgSQL.

Alexander Korotkov sent in two more revisions of a patch to implement incremental sort.

Julian Markwort sent in two more revisions of a patch to add a plan option to pg_stat_statements.

Pavel Stěhule sent in two more revisions of a patch to implement schema variables.

Dilip Kumar sent in two more revisions of a patch to ensure that InitXLogInsert is never called in a critical section.

Peter Eisentraut sent in another revision of a patch to use file cloning in pg_upgrade and CREATE DATABASE.

Pavan Deolasee, Peter Geoghegan, and Amit Langote traded patches to implement MERGE.

Daniel Gustafsson sent in another revision of a patch to support sending an optional message in backend cancel/terminate.

Tomas Vondra sent in another revision of a patch to implement BRIN multi-range indexes and BRIN Bloom indexes.

Michael Banck sent in a patch to allow setting replication slots in recovery.conf even if wal-method is none.

Daniel Gustafsson sent in a patch to pg_basebackup to add missing newlines in some error messages.

Konstantin Knizhnik sent in another revision of a patch to optimize secondary indexes.

Thomas Munro sent in a patch to tweak the JIT docs.

Doug Rady sent in another revision of a patch to enable building pgbench using ppoll() for larger connection counts.

Tom Lane and John Naylor traded patches to rationalize the way bootstrap data is handled.

Pavel Stěhule sent in a patch to enable CALL with named default arguments in PL/pgSQL.

Nathan Bossart sent in a patch to combine options for RangeVarGetRelidExtended() into a flags argument and add a skip-locked option to same.

Fabien COELHO sent in a patch to fix some constants in the new general-purpose hashing functions for pgbench.

Amit Langote sent in another revision of a patch to refactor the partitioning code.

Dmitry Dolgov sent in another revision of a patch to implement generic type subscripting and use same for arrays and JSONB.

Teodor Sigaev sent in another revision of a patch to add a prefix operator for text with SP-GiST support.

David Rowley sent in two more revisions of a patch to remove useless DISTINCT clauses.

Fabien COELHO sent in two more revisions of a patch to implement --random-seed for pgbench.

David Steele and Michael Banck traded patches to verify checksums during basebackups.

Pavan Deolasee sent in two more revisions of a patch to speed up inserts with mostly-monotonically increasing values.

Daniel Vérité and Pavel Stěhule traded patches to add a csv format to psql.

Tomas Vondra sent in a patch to fix a minor mistake in a brin_inclusion.c comment.

Ashutosh Bapat sent in a patch to fix a comment in BuildTupleFromCStrings().

Teodor Sigaev sent in another revision of a patch to implement predicate locking on GIN indexes.

Ashutosh Bapat sent in a patch to fix an issue with procedure name resolution.

Julian Markwort sent in another revision of a patch to add a new auth option to pg_hba.conf: clientcert=verify-full.

Simon Riggs sent in another revision of a patch to implement logical decoding of two-phase transactions.

David Steele sent in a patch to add more TAP tests to pg_rewind.

Pavan Deolasee sent in another revision of a patch to change the WAL header to reduce contention during ReserveXLogInsertLocation().

Haribabu Kommi sent in two more revisions of a patch to make PQhost return connected host and hostaddr details.

Fabien COELHO sent in a patch to pgbench to test whether a variable exists.

Robert Haas sent in another revision of a patch to enable parallel seq scan for slow functions.

Peter Eisentraut sent in another revision of a patch to enable nested CALL with transactions in PL/pgSQL.

Fabien COELHO sent in another revision of a patch to pgbench to enable it to store SELECT results into variables.

Tom Lane sent in a patch to help with backend memory dump analysis by adding context identifiers.

David Rowley sent in two more revisions of a patch to speed up execution of ALTER TABLE ... ADD COLUMN ... DEFAULT [not null].

Michaël Paquier sent in a patch to simplify the final sync in pg_rewind's target folder and add --no-sync.

Takayuki Tsunakawa sent in another revision of a patch to fix a problem in ECPG where freeing memory for pgtypes crashes on Windows.

PostgreSQL Weekly News - March 25, 2018
