Alvaro Herrera [Fri, 15 Jan 2021 13:31:42 +0000 (10:31 -0300)] Avoid spurious wait in concurrent reindex
This is like commit
c98763bf51bf, but for REINDEX CONCURRENTLY. To wit:
this flags indicates that the current process is safe to ignore for the
purposes of waiting for other snapshots, when doing CREATE INDEX
CONCURRENTLY and REINDEX CONCURRENTLY. This helps two processes doing
either of those things not deadlock, and also avoids spurious waits.
Author: Álvaro Herrera <
[email protected]>
Reviewed-by: Dmitry Dolgov <[email protected]>Reviewed-by: Hamid Akhtar <[email protected]>Reviewed-by: Masahiko Sawada <[email protected]>Discussion: https://postgr.es/m/
20201130195439[email protected]Fujii Masao [Fri, 15 Jan 2021 03:44:17 +0000 (12:44 +0900)] Fix calculation of how much shared memory is required to store a TOC.
Commit
ac883ac453 refactored shm_toc_estimate() but changed its calculation
of shared memory size for TOC incorrectly. Previously this could cause too
large memory to be allocated.
Back- to v11 where the bug was introduced.
Author: Takayuki Tsunakawa
Discussion: https://postgr.es/m/TYAPR01MB2990BFB73170E2C4921E2C4DFEA80@TYAPR01MB2990.jpnprd01.prod.outlook.com
Michael Paquier [Fri, 15 Jan 2021 02:46:34 +0000 (11:46 +0900)] Remove PG_SHA*_DIGEST_STRING_LENGTH from sha2.h
The last reference to those variables has been removed in
aef8948, so
this cleans up a bit the code.
Discussion: https://postgr.es/m/X//
[email protected]Michael Paquier [Fri, 15 Jan 2021 01:33:13 +0000 (10:33 +0900)] Fix O(N^2) stat() calls when recycling WAL segments
The counter tracking the last segment number recycled was getting
initialized when recycling one single segment, while it should be used
across a full cycle of segments recycled to prevent useless checks
related to entries already recycled.
This performance issue has been introduced by
b2a5545, and it was first
implemented in
61b86142.
No back is done per the lack of field complaints.
Reported-by: Andres Freund, Thomas MunroAuthor: Michael Paquier
Reviewed-By: Andres FreundDiscussion: https://postgr.es/m/
20170621211016[email protected]Discussion: https://postgr.es/m/CA+hUKG+DRiF9z1_MU4fWq+RfJMxP7zjoptfcmuCFPeO4JM2iVg@mail.gmail.com
Fujii Masao [Fri, 15 Jan 2021 01:30:19 +0000 (10:30 +0900)] postgres_fdw: Save foreign server OID in connection cache entry.
The foreign server OID stored in the connection cache entry is used as
a lookup key to directly get the server name.
Previously since the connection cache entry did not have the server OID,
postgres_fdw had to get the server OID at first from user mapping before
getting the server name. So if the corresponding user mapping was dropped,
postgres_fdw could raise the error "cache lookup failed for user mapping"
while looking up user mapping and fail to get the server name even though
the server had not been dropped yet.
Author: Bharath Rupireddy
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CALj2ACVRZPUB7ZwqLn-6DY8C_UmPs6084gSpHA92YBv++1AJXA@mail.gmail.com
Tom Lane [Thu, 14 Jan 2021 21:19:38 +0000 (16:19 -0500)] pg_dump: label PUBLICATION TABLE ArchiveEntries with an owner.
This is the same fix as commit
9eabfe300 applied to INDEX ATTACH
entries, but for table-to-publication attachments. As in that
case, even though the backend doesn't record "ownership" of the
attachment, we still ought to label it in the dump archive with
the role name that should run the ALTER PUBLICATION command.
The existing behavior causes the ALTER to be done by the original
role that started the restore; that will usually work fine, but
there may be corner cases where it fails.
The bulk of the is concerned with changing struct
PublicationRelInfo to include a pointer to the associated
PublicationInfo object, so that we can get the owner's name
out of that when the time comes. While at it, I rewrote
getPublicationTables() to do just one query of pg_publication_rel,
not one per table.
Back- to v10 where this code was introduced.
Discussion: https://postgr.es/m/
1165710.
1610473242@sss.pgh.pa.us
Alvaro Herrera [Thu, 14 Jan 2021 18:32:14 +0000 (15:32 -0300)] Prevent drop of tablespaces used by partitioned relations
When a tablespace is used in a partitioned relation (per commits
ca4103025dfe in pg12 for tables and
33e6c34c3267 in pg11 for indexes),
it is possible to drop the tablespace, potentially causing various
problems. One such was reported in bug #16577, where a rewriting ALTER
TABLE causes a server crash.
Protect against this by using pg_shdepend to keep track of tablespaces
when used for relations that don't keep physical files; we now abort a
tablespace if we see that the tablespace is referenced from any
partitioned relations.
Back this to 11, where this problem has been latent all along. We
don't try to create pg_shdepend entries for existing partitioned
indexes/tables, but any ones that are modified going forward will be
protected.
Note slight behavior change: when trying to drop a tablespace that
contains both regular tables as well as partitioned ones, you'd
previously get ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE and now you'll
get ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST. Arguably, the latter is more
correct.
It is possible to add protecting pg_shdepend entries for existing
tables/indexes, by doing
ALTER TABLE ONLY some_partitioned_table SET TABLESPACE pg_default;
ALTER TABLE ONLY some_partitioned_table SET TABLESPACE original_tablespace;
for each partitioned table/index that is not in the database default
tablespace. Because these partitioned objects do not have storage, no
file needs to be actually moved, so it shouldn't take more time than
what's required to acquire locks.
This query can be used to search for such relations:
SELECT ... FROM pg_class WHERE relkind IN ('p', 'I') AND reltablespace <> 0
Reported-by: Alexander Lakhin <[email protected]>Discussion: https://postgr.es/m/16577-
881633a9f9894fd5@postgresql.org
Author: Álvaro Herrera <
[email protected]>
Reviewed-by: Michael Paquier <[email protected]>Fujii Masao [Thu, 14 Jan 2021 05:37:01 +0000 (14:37 +0900)] Stabilize timeline switch regression test.
Commit
fef5b47f6b added the regression test to check whether a standby is
able to follow a primary on a newer timeline when WAL archiving is enabled.
But the buildfarm member florican reported that this test failed because
the requested WAL segment was removed and replication failed. This is a
timing issue. Since neither replication slot is used nor wal_keep_size is set
in the test, checkpoint could remove the WAL segment that's still necessary
for replication.
This commit stabilizes the test by setting wal_keep_size.
Back- to v13 where the regression test that this commit stabilizes
was added.
Author: Fujii Masao
Discussion: https://postgr.es/m/X//
[email protected]Fujii Masao [Thu, 14 Jan 2021 06:41:22 +0000 (15:41 +0900)] Improve tab-completion for CLOSE, DECLARE, FETCH and MOVE.
This commit makes CLOSE, FETCH and MOVE commands tab-complete the list of
cursors. Also this commit makes DECLARE command tab-complete the options.
Author: Shinya Kato, Sawada Masahiko, tweaked by Fujii Masao
Reviewed-by: Shinya Kato, Sawada Masahiko, Fujii MasaoDiscussion: https://postgr.es/m/
b0e4c5c53ef84c5395524f5056fc71f0@MP-MSGSS-MBX001.msg.nttdata.co.jp
Thomas Munro [Thu, 14 Jan 2021 05:09:32 +0000 (18:09 +1300)] Minor header cleanup for the new iovec code.
Remove redundant function declaration and improve header comment in
pg_iovec.h. Move the new declaration in fd.h next to a group of more
similar functions.
Fujii Masao [Thu, 14 Jan 2021 03:27:11 +0000 (12:27 +0900)] Ensure that a standby is able to follow a primary on a newer timeline.
Commit
709d003fbd refactored WAL-reading code, but accidentally caused
WalSndSegmentOpen() to fail to follow a timeline switch while reading from
a historic timeline. This issue caused a standby to fail to follow a primary
on a newer timeline when WAL archiving is enabled.
If there is a timeline switch within the segment, WalSndSegmentOpen() should
read from the WAL segment belonging to the new timeline. But previously
since it failed to follow a timeline switch, it tried to read the WAL segment
with old timeline. When WAL archiving is enabled, that WAL segment with
old timeline doesn't exist because it's renamed to .partial. This leads
a primary to have tried to read non-existent WAL segment, and which caused
replication to faill with the error "ERROR: requested WAL segment ... has
already been removed".
This commit fixes WalSndSegmentOpen() so that it's able to follow a timeline
switch, to ensure that a standby is able to follow a primary on a newer
timeline even when WAL archiving is enabled.
This commit also adds the regression test to check whether a standby is
able to follow a primary on a newer timeline when WAL archiving is enabled.
Back- to v13 where the bug was introduced.
Reported-by: Kyotaro HoriguchiAuthor: Kyotaro Horiguchi, tweaked by Fujii Masao
Reviewed-by: Alvaro Herrera, Fujii MasaoDiscussion: https://postgr.es/m/
20201209.174314.
282492377848029776[email protected]Michael Paquier [Thu, 14 Jan 2021 02:13:24 +0000 (11:13 +0900)] Rework refactoring of hex and encoding routines
This commit addresses some issues with
c3826f83 that moved the hex
decoding routine to src/common/:
- The decoding function lacked overflow checks, so when used for
security-related features it was an open door to out-of-bound writes if
not carefully used that could remain undetected. Like the base64
routines already in src/common/ used by SCRAM, this routine is reworked
to check for overflows by having the size of the destination buffer
passed as argument, with overflows checked before doing any writes.
- The encoding routine was missing. This is moved to src/common/ and
it gains the same overflow checks as the decoding part.
On failure, the hex routines of src/common/ issue an error as per the
discussion done to make them usable by frontend tools, but not by shared
libraries. Note that this is why ECPG is left out of this commit, and
it still includes a duplicated logic doing hex encoding and decoding.
While on it, this commit uses better variable names for the source and
destination buffers in the existing escape and base64 routines in
encode.c and it makes them more robust to overflow detection. The
previous core code issued a FATAL after doing out-of-bound writes if
going through the SQL functions, which would be enough to detect
problems when working on changes that impacted this area of the
code. Instead, an error is issued before doing an out-of-bound write.
The hex routines were being directly called for bytea conversions and
backup manifests without such sanity checks. The current calls happen
to not have any problems, but careless uses of such APIs could easily
lead to CVE-class bugs.
Author: Bruce Momjian, Michael Paquier
Reviewed-by: Sehrope SarkuniDiscussion: https://postgr.es/m/
20201231003557[email protected]Thomas Munro [Wed, 13 Jan 2021 22:10:24 +0000 (11:10 +1300)] Move our p{read,write}v replacements into their own files.
macOS's ranlib issued a warning about an empty pread.o file with the
previous arrangement, on systems new enough to require no replacement
functions. Let's go back to using configure's AC_REPLACE_FUNCS system
to build and include each .o in the library only if it's needed, which
requires moving the *v() functions to their own files.
Also move the _with_retry() wrapper to a more permanent home.
Reported-by: Tom Lane <[email protected]>Discussion: https://postgr.es/m/
1283127.
1610554395%40sss.pgh.pa.us
Tom Lane [Wed, 13 Jan 2021 21:23:15 +0000 (16:23 -0500)] Mark inet_server_addr() and inet_server_port() as parallel-restricted.
These need to be PR because they access the MyProcPort data structure,
which doesn't get copied to parallel workers. The very similar
functions inet_client_addr() and inet_client_port() are already
marked PR, but somebody missed these.
Although this is a pre-existing bug, we can't readily fix it in the back
branches since we can't force initdb. Given the small usage of these
two functions, and the even smaller likelihood that they'd get pushed to
a parallel worker anyway, it doesn't seem worth the trouble to suggest
that DBAs should fix it manually.
Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoAT4aHP0Uxq91qpD7NL009tnUYQe-b14R3MnSVOjtE71g@mail.gmail.com
Tom Lane [Wed, 13 Jan 2021 21:14:38 +0000 (16:14 -0500)] Run reformat-dat-files to declutter the catalog data files.
Things had gotten pretty messy here, apparently mostly but not
entirely the fault of the multirange . No functional changes.
Tom Lane [Wed, 13 Jan 2021 20:59:57 +0000 (15:59 -0500)] Doc, more or less: uncomment tutorial example that was fixed long ago.
Reverts a portion of commit
344190b7e. Apparently, back in the
twentieth century we had some issues with multi-statement SQL
functions, but they've worked fine for a long time.
Daniel Westermann
Discussion: https://postgr.es/m/GVAP278MB04242DCBF5E31F528D53FA18D2A90@GVAP278MB0424.CHEP278.PROD.OUTLOOK.COM
Alvaro Herrera [Wed, 13 Jan 2021 20:55:41 +0000 (17:55 -0300)] Call out vacuum considerations in create index docs
Back to pg12, which is as far as it goes without conflicts.
Author: James Coleman <
[email protected]>
Reviewed-by: "David G. Johnston" <[email protected]>Discussion: https://postgr.es/m/CAAaqYe9oEfbz7AxXq7OX+FFVi5w5p1e_Of8ON8ZnKO9QqBfmjg@mail.gmail.com
Tom Lane [Wed, 13 Jan 2021 19:52:49 +0000 (14:52 -0500)] Disallow a digit as the first character of a variable name in pgbench.
The point of this restriction is to avoid trying to substitute variables
into timestamp literal values, which may contain strings like '12:34'.
There is a good deal more that should be done to reduce pgbench's
tendency to substitute where it shouldn't. But this is sufficient to
solve the case complained of by Jaime Soler, and it's simple enough
to back-.
Back- to v11; before commit
9d36a3866, pgbench had a slightly
different definition of what a variable name is, and anyway it seems
unwise to change long-stable branches for this.
Fabien Coelho
Discussion: https://postgr.es/m/alpine.DEB.2.22.394.
2006291740420.805678@pseudo
Heikki Linnakangas [Wed, 13 Jan 2021 18:57:21 +0000 (20:57 +0200)] Fix test failure with wal_level=minimal.
The newly-added gist pageinspect test prints the LSNs of GiST pages,
expecting them all to be 1 (GistBuildLSN). But with wal_level=minimal,
they got updated by the whole-relation WAL-logging at commit. Fix by
wrapping the problematic tests in the same transaction with the CREATE
INDEX.
Per buildfarm failure on thorntail.
Discussion: https://www.postgresql.org/message-id/
3B4F97E5-40FB-4142-8CAA-
B301CDFBF982%40iki.fi
Tom Lane [Wed, 13 Jan 2021 18:30:04 +0000 (13:30 -0500)] Doc: clarify behavior of back-half options in pg_dump.
Options that change how the archive data is converted to SQL text
are ignored when dumping to archive formats. The documentation
previously said "not meaningful", which is not helpful.
Discussion: https://postgr.es/m/
161052021249.12228.
9598689907884726185@wrigleys.postgresql.org
Peter Geoghegan [Wed, 13 Jan 2021 17:21:32 +0000 (09:21 -0800)] Enhance nbtree index tuple deletion.
Teach nbtree and heapam to cooperate in order to eagerly remove
duplicate tuples representing dead MVCC versions. This is "bottom-up
deletion". Each bottom-up deletion pass is triggered lazily in response
to a flood of versions on an nbtree leaf page. This usually involves a
"logically unchanged index" hint (these are produced by the executor
mechanism added by commit
9dc718bd).
The immediate goal of bottom-up index deletion is to avoid "unnecessary"
page splits caused entirely by version duplicates. It naturally has an
even more useful effect, though: it acts as a backstop against
accumulating an excessive number of index tuple versions for any given
_logical row_. Bottom-up index deletion complements what we might now
call "top-down index deletion": index vacuuming performed by VACUUM.
Bottom-up index deletion responds to the immediate local needs of
queries, while leaving it up to autovacuum to perform infrequent clean
sweeps of the index. The overall effect is to avoid certain
pathological performance issues related to "version churn" from UPDATEs.
The previous tableam interface used by index AMs to perform tuple
deletion (the table_compute_xid_horizon_for_tuples() function) has been
replaced with a new interface that supports certain new requirements.
Many (perhaps all) of the capabilities added to nbtree by this commit
could also be extended to other index AMs. That is left as work for a
later commit.
Extend deletion of LP_DEAD-marked index tuples in nbtree by adding logic
to consider extra index tuples (that are not LP_DEAD-marked) for
deletion in passing. This increases the number of index tuples deleted
significantly in many cases. The LP_DEAD deletion process (which is now
called "simple deletion" to clearly distinguish it from bottom-up
deletion) won't usually need to visit any extra table blocks to check
these extra tuples. We have to visit the same table blocks anyway to
generate a latestRemovedXid value (at least in the common case where the
index deletion operation's WAL record needs such a value).
Testing has shown that the "extra tuples" simple deletion enhancement
increases the number of index tuples deleted with almost any workload
that has LP_DEAD bits set in leaf pages. That is, it almost never fails
to delete at least a few extra index tuples. It helps most of all in
cases that happen to naturally have a lot of delete-safe tuples. It's
not uncommon for an individual deletion operation to end up deleting an
order of magnitude more index tuples compared to the old naive approach
(e.g., custom instrumentation of the shows that this happens
fairly often when the regression tests are run).
Add a further enhancement that augments simple deletion and bottom-up
deletion in indexes that make use of deduplication: Teach nbtree's
_bt_delitems_delete() function to support granular TID deletion in
posting list tuples. It is now possible to delete individual TIDs from
posting list tuples provided the TIDs have a tableam block number of a
table block that gets visited as part of the deletion process (visiting
the table block can be triggered directly or indirectly). Setting the
LP_DEAD bit of a posting list tuple is still an all-or-nothing thing,
but that matters much less now that deletion only needs to start out
with the right _general_ idea about which index tuples are deletable.
Bump XLOG_PAGE_MAGIC because xl_btree_delete changed.
No bump in BTREE_VERSION, since there are no changes to the on-disk
representation of nbtree indexes. Indexes built on PostgreSQL 12 or
PostgreSQL 13 will automatically benefit from bottom-up index deletion
(i.e. no reindexing required) following a pg_upgrade. The enhancement
to simple deletion is available with all B-Tree indexes following a
pg_upgrade, no matter what PostgreSQL version the user upgrades from.
Author: Peter Geoghegan <
[email protected]>
Reviewed-By: Heikki Linnakangas <[email protected]>Reviewed-By: Victor Yegorov <[email protected]>Discussion: https://postgr.es/m/CAH2-Wzm+maE3apHB8NOtmM=p-DO65j2V5GzAWCOEEuy3JZgb2g@mail.gmail.com
Peter Geoghegan [Wed, 13 Jan 2021 16:11:00 +0000 (08:11 -0800)] Pass down "logically unchanged index" hint.
Add an executor aminsert() hint mechanism that informs index AMs that
the incoming index tuple (the tuple that accompanies the hint) is not
being inserted by execution of an SQL statement that logically modifies
any of the index's key columns.
The hint is received by indexes when an UPDATE takes place that does not
apply an optimization like heapam's HOT (though only for indexes where
all key columns are logically unchanged). Any index tuple that receives
the hint on insert is expected to be a duplicate of at least one
existing older version that is needed for the same logical row. Related
versions will typically be stored on the same index page, at least
within index AMs that apply the hint.
Recognizing the difference between MVCC version churn duplicates and
true logical row duplicates at the index AM level can help with cleanup
of garbage index tuples. Cleanup can intelligently target tuples that
are likely to be garbage, without wasting too many cycles on less
promising tuples/pages (index pages with little or no version churn).
This is infrastructure for an upcoming commit that will teach nbtree to
perform bottom-up index deletion. No index AM actually applies the hint
just yet.
Author: Peter Geoghegan <
[email protected]>
Reviewed-By: Victor Yegorov <[email protected]>Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com
Fujii Masao [Wed, 13 Jan 2021 13:59:17 +0000 (22:59 +0900)] Log long wait time on recovery conflict when it's resolved.
This is a follow-up of the work done in commit
0650ff2303. This commit
extends log_recovery_conflict_waits so that a log message is produced
also when recovery conflict has already been resolved after deadlock_timeout
passes, i.e., when the startup process finishes waiting for recovery
conflict after deadlock_timeout. This is useful in investigating how long
recovery conflicts prevented the recovery from applying WAL.
Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Bertrand DrouvotDiscussion: https://postgr.es/m/
9a60178c-a853-1440-2cdc-
c3af916cff59@amazon.com
Heikki Linnakangas [Wed, 13 Jan 2021 10:32:54 +0000 (12:32 +0200)] Fix portability issues in the new gist pageinspect test.
1. The raw bytea representation of the point-type keys used in the test
depends on endianess. Remove the raw key_data column from the test.
2. The items stored on non-leftmost gist page depends on how many items
git on the other pages. This showed up as a failure on 32-bit i386
systems. To fix, only test the gist_page_items() function on the
leftmost leaf page.
Per Andrey Borodin and the buildfarm.
Discussion: https://www.postgresql.org/message-id/
9FCEC1DC-86FB-4A57-88EF-
DD13663B36AF%40yandex-team.ru
Magnus Hagander [Wed, 13 Jan 2021 10:07:37 +0000 (11:07 +0100)] Remove incorrect markup
Seems
737d69ffc3c made a copy/paste or automation error resulting in two
extra right-parenthesis.
Reported-By: Michael VastolaBack-through: 13
Discussion: https://postgr.es/m/
161051035421.12224.
1741822783166533529@wrigleys.postgresql.org
Heikki Linnakangas [Wed, 13 Jan 2021 08:33:33 +0000 (10:33 +0200)] Add functions to 'pageinspect' to inspect GiST indexes.
Author: Andrey Borodin and me
Discussion: https://www.postgresql.org/message-id/
3E4F9093-A1B5-4DF8-A292-
0B48692E3954%40yandex-team.ru
Thomas Munro [Wed, 13 Jan 2021 06:11:09 +0000 (19:11 +1300)] Don't use elog() in src/port/pwrite.c.
Nothing broke because of this oversight yet, but it would fail to link
if we tried to use pg_pwrite() in frontend code on a system that lacks
pwrite(). Use an assertion instead. Also pgindent while here.
Discussion: https://postgr.es/m/CA%2BhUKGL57RvoQsS35TVPnQoPYqbtBixsdRhynB8NpcUKpHTTtg%40mail.gmail.com
Amit Kapila [Wed, 13 Jan 2021 02:49:50 +0000 (08:19 +0530)] Fix memory in SnapBuildSerialize.
The memory for the snapshot was while serializing it to disk during
logical decoding. This memory will be freed only once walsender stops
the changes. This can lead to a huge memory increase when master
logs Standby Snapshot too frequently say when the user is trying to create
many replication slots.
Reported-by: [email protected]Diagnosed-by: [email protected]Author: Amit Kapila
Back-through: 9.5
Discussion: https://postgr.es/m/
033ab54c-6393-42ee-8ec9-
2b399b5d8cde[email protected]Amit Kapila [Wed, 13 Jan 2021 02:16:11 +0000 (07:46 +0530)] Optimize DropRelFileNodesAllBuffers() for recovery.
Similar to commit
d6ad34f341, this optimizes
DropRelFileNodesAllBuffers() by avoiding the complete buffer pool scan and
instead find the buffers to be invalidated by doing lookups in the
BufMapping table.
This optimization helps operations where the relation files need to be
removed like Truncate, Drop, Abort of Create Table, etc.
Author: Kirk Jamison
Reviewed-by: Kyotaro Horiguchi, Takayuki Tsunakawa, and Amit KapilaTested-By: Haiying TangDiscussion: https://postgr.es/m/OSBPR01MB3207DCA7EC725FDD661B3EDAEF660@OSBPR01MB3207.jpnprd01.prod.outlook.com
Michael Paquier [Wed, 13 Jan 2021 01:32:21 +0000 (10:32 +0900)] Fix routine name in comment of catcache.c
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACUDXLAkf_XxQO9tAUtnTNGi3Lmd8fANd+vBJbcHn1HoWA@mail.gmail.com
Alvaro Herrera [Tue, 12 Jan 2021 20:04:49 +0000 (17:04 -0300)] Invent struct ReindexIndexInfo
This struct is used by ReindexRelationConcurrently to keep track of the
relations to process. This saves having to obtain some data repeatedly,
and has future uses as well.
Reviewed-by: Dmitry Dolgov <[email protected]>Reviewed-by: Hamid Akhtar <[email protected]>Reviewed-by: Masahiko Sawada <[email protected]>Discussion: https://postgr.es/m/
20201130195439[email protected]Tom Lane [Tue, 12 Jan 2021 18:37:38 +0000 (13:37 -0500)] pg_dump: label INDEX ATTACH ArchiveEntries with an owner.
Although a partitioned index's attachment to its parent doesn't
have separate ownership, the ArchiveEntry for it needs to be
marked with an owner anyway, to ensure that the ALTER command
is run by the appropriate role when restoring with
--use-set-session-authorization. Without this, the ALTER will
be run by the role that started the restore session, which will
usually work but it's formally the wrong thing.
Back- to v11 where this type of ArchiveEntry was added.
In HEAD, add equivalent commentary to the just-added TABLE ATTACH
case, which I'd made do the right thing already.
Discussion: https://postgr.es/m/
1094034.
1610418498@sss.pgh.pa.us
Tom Lane [Tue, 12 Jan 2021 17:52:14 +0000 (12:52 -0500)] Doc: fix description of privileges needed for ALTER PUBLICATION.
Adding a table to a publication requires ownership of the table
(in addition to ownership of the publication). This was mentioned
nowhere.
Alvaro Herrera [Tue, 12 Jan 2021 14:48:45 +0000 (11:48 -0300)] Fix thinko in comment
This comment has been wrong since its introduction in commit
2c03216d8311.
Author: Masahiko Sawada <
[email protected]>
Discussion: https://postgr.es/m/CAD21AoAzz6qipFJBbGEaHmyWxvvNDp8httbwLR9tUQWaTjUs2Q@mail.gmail.com
Amit Kapila [Tue, 12 Jan 2021 02:49:39 +0000 (08:19 +0530)] Fix relation descriptor .
We missed closing the relation descriptor while sending changes via the
root of partitioned relations during logical replication.
Author: Amit Langote and Mark Zhao
Reviewed-by: Amit Kapila and Ashutosh BapatBack-through: 13, where it was introduced
Discussion: https://postgr.es/m/
[email protected]Discussion: https://postgr.es/m/
[email protected]Amit Kapila [Tue, 12 Jan 2021 02:15:40 +0000 (07:45 +0530)] Optimize DropRelFileNodeBuffers() for recovery.
The recovery path of DropRelFileNodeBuffers() is optimized so that
scanning of the whole buffer pool can be avoided when the number of
blocks to be truncated in a relation is below a certain threshold. For
such cases, we find the buffers by doing lookups in BufMapping table.
This improves the performance by more than 100 times in many cases
when several small tables (tested with 1000 relations) are truncated
and where the server is configured with a large value of shared
buffers (greater than equal to 100GB).
This optimization helps cases (a) when vacuum or autovacuum truncated off
any of the empty pages at the end of a relation, or (b) when the relation is
truncated in the same transaction in which it was created.
This commit introduces a new API smgrnblocks_cached which returns a cached
value for the number of blocks in a relation fork. This helps us to determine
the exact size of relation which is required to apply this optimization. The
exact size is required to ensure that we don't leave any buffer for the
relation being dropped as otherwise the background writer or checkpointer
can lead to a PANIC error while flushing buffers corresponding to files that
don't exist.
Author: Kirk Jamison based on ideas by Amit Kapila
Reviewed-by: Kyotaro Horiguchi, Takayuki Tsunakawa, and Amit Kapila
Tested-By: Haiying Tang
Discussion: https://postgr.es/m/OSBPR01MB3207DCA7EC725FDD661B3EDAEF660@OSBPR01MB3207.jpnprd01.prod.outlook.com
Tom Lane [Tue, 12 Jan 2021 02:09:03 +0000 (21:09 -0500)] Dump ALTER TABLE ... ATTACH PARTITION as a separate ArchiveEntry.
Previously, we emitted the ATTACH PARTITION command as part of
the child table's ArchiveEntry. This was a poor choice since it
complicates restoring the partition as a standalone table; you have
to ignore the error from the ATTACH, which isn't even an option when
restoring direct-to-database with pg_restore. (pg_restore will issue
the whole ArchiveEntry as one PQexec, so that any error rolls back
the table creation as well.) Hence, separate it out as its own
ArchiveEntry, as indeed we already did for index ATTACH PARTITION
commands.
Justin Pryzby
Discussion: https://postgr.es/m/
20201023052940[email protected]Tom Lane [Tue, 12 Jan 2021 00:58:07 +0000 (19:58 -0500)] Make pg_dump's table of object-type priorities more maintainable.
Wedging a new object type into this table has historically required
manually renumbering a lot of existing entries. (Although it appears
that some people got lazy and re-used the priority level of an
existing object type, even if it wasn't particularly related.)
We can let the compiler do the counting by inventing an enum type that
lists the desired priority levels in order. Now, if you want to add
or remove a priority level, that's a one-liner.
This is not purely cosmetic, because I split apart the priorities
of DO_COLLATION and DO_TRANSFORM, as well as those of DO_ACCESS_METHOD
and DO_OPERATOR, which look to me to have been merged out of expediency
rather than because it was a good idea. Shell types continue to be
sorted interchangeably with full types, and opclasses interchangeably
with opfamilies.
Thomas Munro [Mon, 11 Jan 2021 21:55:35 +0000 (10:55 +1300)] Fix function s in dependency.h.
Commit
257836a7 accidentally deleted a couple of
redundant-but-conventional "extern" keywords on function s.
Put them back.
Reported-by: Alvaro Herrera <[email protected]>Tom Lane [Mon, 11 Jan 2021 19:53:42 +0000 (14:53 -0500)] Rethink SQLSTATE code for ERRCODE_IDLE_SESSION_TIMEOUT.
Move it to class 57 (Operator Intervention), which seems like a
better choice given that from the client's standpoint it behaves
a heck of a lot like, e.g., ERRCODE_ADMIN_SHUTDOWN.
In a green field I'd put ERRCODE_IDLE_IN_TRANSACTION_SESSION_TIMEOUT
here as well. But that's been around for a few years, so it's
probably too late to change its SQLSTATE code.
Discussion: https://postgr.es/m/
763A0689-F189-459E-946F-
F0EC4458980B@hotmail.com
Tom Lane [Mon, 11 Jan 2021 19:12:31 +0000 (14:12 -0500)] Try next host after a "cannot connect now" failure.
If a server returns ERRCODE_CANNOT_CONNECT_NOW, try the next host,
if multiple host names have been provided. This allows dealing
gracefully with standby servers that might not be in hot standby mode
yet.
In the wake of the preceding commit, it might be plausible to retry
many more error cases than we do now, but I (tgl) am hesitant to
move too aggressively on that --- it's not clear it'd be desirable
for cases such as bad-password, for example. But this case seems
safe enough.
Hubert Zhang, reviewed by Takayuki Tsunakawa
Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com
Tom Lane [Mon, 11 Jan 2021 19:03:39 +0000 (14:03 -0500)] Uniformly identify the target host in libpq connection failure reports.
Prefix "could not connect to host-or-socket-path:" to all connection
failure cases that occur after the socket() call, and remove the
ad-hoc server identity data that was appended to a few of these
messages. This should produce much more intelligible error reports
in multiple-target-host situations, especially for error cases that
are off the beaten track to any degree (because none of those provided
any server identity info).
As an example of the change, formerly a connection attempt with a bad
port number such as "psql -p 12345 -h localhost,/tmp" might produce
psql: error: could not connect to server: Connection refused
Is the server running on host "localhost" (::1) and accepting
TCP/IP connections on port 12345?
could not connect to server: Connection refused
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 12345?
could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.12345"?
Now it looks like
psql: error: could not connect to host "localhost" (::1), port 12345: Connection refused
Is the server running on that host and accepting TCP/IP connections?
could not connect to host "localhost" (127.0.0.1), port 12345: Connection refused
Is the server running on that host and accepting TCP/IP connections?
could not connect to socket "/tmp/.s.PGSQL.12345": No such file or directory
Is the server running locally and accepting connections on that socket?
This requires adjusting a couple of regression tests to allow for
variation in the contents of a connection failure message.
Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com
Tom Lane [Mon, 11 Jan 2021 18:43:19 +0000 (13:43 -0500)] Allow pg_regress.c wrappers to postprocess test result files.
Add an optional callback to regression_main() that, if provided,
is invoked on each test output file before we try to compare it
to the expected-result file.
The main and isolation test programs don't need this (yet).
In pg_regress_ecpg, add a filter that eliminates target-host
details from "could not connect" error reports. This filter
doesn't do anything as of this commit, but it will be needed
by the next one.
In the long run we might want to provide some more general,
perhaps pattern-based, filtering mechanism for test output.
For now, this will solve the immediate problem.
Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com
Tom Lane [Mon, 11 Jan 2021 18:12:09 +0000 (13:12 -0500)] In libpq, always append new error messages to conn->errorMessage.
Previously, we had an undisciplined mish-mash of printfPQExpBuffer and
appendPQExpBuffer calls to report errors within libpq. This commit
establishes a uniform rule that appendPQExpBuffer[Str] should be used.
conn->errorMessage is reset only at the start of an application request,
and then accumulates messages till we're done. We can remove no less
than three different ad-hoc mechanisms that were used to get the effect
of concatenation of error messages within a sequence of operations.
Although this makes things quite a bit cleaner conceptually, the main
reason to do it is to make the world safer for the multiple-target-host
feature that was added awhile back. Previously, there were many cases
in which an error occurring during an individual host connection attempt
would wipe out the record of what had happened during previous attempts.
(The reporting is still inadequate, in that it can be hard to tell which
host got the failure, but that seems like a matter for a separate commit.)
Currently, lo_import and lo_export contain exceptions to the "never
use printfPQExpBuffer" rule. If we changed them, we'd risk reporting
an incidental lo_close failure before the actual read or write
failure, which would be confusing, not least because lo_close happened
after the main failure. We could improve this by inventing an
internal version of lo_close that doesn't reset the errorMessage; but
we'd also need a version of PQfn() that does that, and it didn't quite
seem worth the trouble for now.
Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com
Thomas Munro [Mon, 11 Jan 2021 01:41:39 +0000 (14:41 +1300)] Use vectored I/O to fill new WAL segments.
Instead of making many block-sized write() calls to fill a new WAL file
with zeroes, make a smaller number of pwritev() calls (or various
emulations). The actual number depends on the OS's IOV_MAX, which
PG_IOV_MAX currently caps at 32. That means we'll write 256kB per call
on typical systems. We may want to tune the number later with more
experience.
Reviewed-by: Tom Lane <[email protected]>Reviewed-by: Andres Freund <[email protected]>Discussion: https://postgr.es/m/CA%2BhUKGJA%2Bu-220VONeoREBXJ9P3S94Y7J%2BkqCnTYmahvZJwM%3Dg%40mail.gmail.com
Thomas Munro [Mon, 11 Jan 2021 01:37:13 +0000 (14:37 +1300)] Provide pg_preadv() and pg_pwritev().
Provide synchronous vectored file I/O routines. These map to preadv()
and pwritev(), with fallback implementations for systems that don't have
them. Also provide a wrapper pg_pwritev_with_retry() that automatically
retries on short writes.
Reviewed-by: Tom Lane <[email protected]>Reviewed-by: Andres Freund <[email protected]>Discussion: https://postgr.es/m/CA%2BhUKGJA%2Bu-220VONeoREBXJ9P3S94Y7J%2BkqCnTYmahvZJwM%3Dg%40mail.gmail.com
Bruce Momjian [Sat, 9 Jan 2021 17:11:16 +0000 (12:11 -0500)] doc: expand description of how non-SELECT queries are processed
The previous description of how the executor processes non-SELECT
queries was very dense, causing lack of clarity. This expanded text
spells it out more simply.
Reported-by: [email protected]Discussion: https://postgr.es/m/
160912275508.676.
17469511338925622905@wrigleys.postgresql.org
Back-through: 9.5
Amit Kapila [Sat, 9 Jan 2021 10:15:38 +0000 (15:45 +0530)] Fix the test for decoding of two-phase transactions.
Commit
5a3574d7b3 added the test for decoding of two-phase transactions
during the build of a consistent snapshot. The test forgot to skip empty
xacts which can lead to decoding of extra empty transactions due to
background activity by autovacuum.
Per report by buildfarm.
Reported-by: Tom LaneDiscussion: https://postgr.es/m/363512.
1610171267@sss.pgh.pa.us
Tom Lane [Fri, 8 Jan 2021 23:12:07 +0000 (18:12 -0500)] Fix plpgsql tests for debug_invalidate_system_caches_always.
Commit
c9d529848 resulted in having a couple more places where
the error context stack for a failure varies depending on
debug_invalidate_system_caches_always (nee CLOBBER_CACHE_ALWAYS).
This is not very surprising, since we have to re-parse cached
plans if the plan cache is clobbered. Stabilize the expected
test output by hiding the context stack in these places,
as we've done elsewhere in this test script.
(Another idea worth considering, now that we have
debug_invalidate_system_caches_always, is to force it to zero for
these test cases. That seems like it'd risk reducing the coverage
of cache-clobber testing, which might or might not be worth being
able to verify that we get the expected error output in normal
cases. For the moment I just stuck with the existing technique.)
In passing, update comments that referred to CLOBBER_CACHE_ALWAYS.
Per buildfarm member hyrax.
Tom Lane [Fri, 8 Jan 2021 17:16:00 +0000 (12:16 -0500)] Fix ancient bug in parsing of BRE-mode regular expressions.
brenext(), when parsing a '*' quantifier, forgot to return any "value"
for the token; per the equivalent case in next(), it should return
value 1 to indicate that greedy rather than non-greedy behavior is
wanted. The result is that the compiled regexp could behave like 'x*?'
rather than the intended 'x*', if we were unlucky enough to have
a zero in v->nextvalue at this point. That seems to happen with some
reliability if we have '.*' at the beginning of a BRE-mode regexp,
although that depends on the initial contents of a stack-allocated
struct, so it's not guaranteed to fail.
Found by Alexander Lakhin using valgrind testing. This bug seems
to be aboriginal in Spencer's code, so back- all the way.
Discussion: https://postgr.es/m/16814-
6c5e3edd2bdf0d50@postgresql.org
Michael Paquier [Fri, 8 Jan 2021 01:37:03 +0000 (10:37 +0900)] Fix and simplify some code related to cryptohashes
This commit addresses two issues:
- In pgcrypto, MD5 computation called pg_cryptohash_{init,update,final}
without checking for the result status.
- Simplify pg_checksum_raw_context to use only one variable for all the
SHA2 options available in checksum manifests.
Reported-by: Heikki LinnakangasDiscussion: https://postgr.es/m/
f62f26bb-47a5-8411-46e5-
4350823e06a5@iki.fi
Tom Lane [Fri, 8 Jan 2021 01:36:09 +0000 (20:36 -0500)] Adjust createdb TAP tests to work on recent OpenBSD.
We found last February that the error-case tests added by commit
008cf0409 failed on OpenBSD, because that platform doesn't really
check locale names. At the time it seemed that that was only an issue
for LC_CTYPE, but testing on a more recent version of OpenBSD shows
that it's now equally lax about LC_COLLATE.
Rather than dropping the LC_COLLATE test too, put back LC_CTYPE
(reverting
c4b0edb07), and adjust these tests to accept the different
error message that we get if setlocale() doesn't reject a bogus locale
name. The point of these tests is not really what the backend does
with the locale name, but to show that createdb quotes funny locale
names safely; so we're not losing test reliability this way.
Back- as appropriate.
Discussion: https://postgr.es/m/231373.
1610058324@sss.pgh.pa.us
Tom Lane [Thu, 7 Jan 2021 16:45:08 +0000 (11:45 -0500)] Further second thoughts about idle_session_timeout .
On reflection, the order of operations in PostgresMain() is wrong.
These timeouts ought to be shut down before, not after, we do the
post-command-read CHECK_FOR_INTERRUPTS, to guarantee that any
timeout error will be detected there rather than at some ill-defined
later point (possibly after having wasted a lot of work).
This is really an error in the original idle_in_transaction_timeout
, so back- to 9.6 where that was introduced.
Tomas Vondra [Thu, 7 Jan 2021 16:31:58 +0000 (17:31 +0100)] Minor fixes in COPY progress docs
Author: Justin Pryzby
Discussion: https://postgr.es/m/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com
Fujii Masao [Thu, 7 Jan 2021 15:47:03 +0000 (00:47 +0900)] Add GUC to log long wait times on recovery conflicts.
This commit adds GUC log_recovery_conflict_waits that controls whether
a log message is produced when the startup process is waiting longer than
deadlock_timeout for recovery conflicts. This is useful in determining
if recovery conflicts prevent the recovery from applying WAL.
Note that currently a log message is produced only when recovery conflict
has not been resolved yet even after deadlock_timeout passes, i.e.,
only when the startup process is still waiting for recovery conflict
even after deadlock_timeout.
Author: Bertrand Drouvot, Masahiko Sawada
Reviewed-by: Alvaro Herrera, Kyotaro Horiguchi, Fujii MasaoDiscussion: https://postgr.es/m/
9a60178c-a853-1440-2cdc-
c3af916cff59@amazon.com
Tom Lane [Thu, 7 Jan 2021 03:09:00 +0000 (22:09 -0500)] Fix bogus link in test comments.
I apparently copied-and-pasted the wrong link in commit
ca8217c10.
Point it where it was meant to go.
Tom Lane [Thu, 7 Jan 2021 03:02:35 +0000 (22:02 -0500)] Improve commentary in timeout.c.
On re-reading I realized that I'd missed one race condition in the new
timeout code. It's safe, but add a comment explaining it.
Discussion: https://postgr.es/m/CA+hUKG+o6pbuHBJSGnud=TadsuXySWA7CCcPgCt2QE9F6_4iHQ@mail.gmail.com
Michael Paquier [Thu, 7 Jan 2021 01:21:02 +0000 (10:21 +0900)] Fix allocation logic of cryptohash context data with OpenSSL
The allocation of the cryptohash context data when building with OpenSSL
was happening in the memory context of the caller of
pg_cryptohash_create(), which could lead to issues with resowner cleanup
if cascading resources are cleaned up on an error. Like other
facilities using resowners, move the base allocation to TopMemoryContext
to ensure a correct cleanup on failure.
The resulting code gets simpler with this commit as the context data is
now hold by a unique opaque pointer, so as there is only one single
allocation done in TopMemoryContext.
After discussion, also change the cryptohash subroutines to return an
error if the caller provides NULL for the context data to ease error
detection on OOM.
Author: Heikki Linnakangas
Discussion: https://postgr.es/m/
[email protected]Tom Lane [Wed, 6 Jan 2021 23:28:42 +0000 (18:28 -0500)] Add idle_session_timeout.
This GUC variable works much like idle_in_transaction_session_timeout,
in that it kills sessions that have waited too long for a new client
query. But it applies when we're not in a transaction, rather than
when we are.
Li Japin, reviewed by David Johnston and Hayato Kuroda, some
fixes by me
Discussion: https://postgr.es/m/
763A0689-F189-459E-946F-
F0EC4458980B@hotmail.com
Tom Lane [Wed, 6 Jan 2021 23:23:03 +0000 (18:23 -0500)] Improve timeout.c's handling of repeated timeout set/cancel.
A very common usage pattern is that we set a timeout that we don't
expect to reach, cancel it after a little bit, and later repeat.
With the original implementation of timeout.c, this results in one
setitimer() call per timeout set or cancel. We can do a lot better
by being lazy about changing the timeout interrupt request, namely:
(1) never cancel the outstanding interrupt, even when we have no
active timeout events;
(2) if we need to set an interrupt, but there already is one pending
at or before the required time, leave it alone. When the interrupt
happens, the signal handler will reschedule it at whatever time is
then needed.
For example, with a one-second setting for statement_timeout, this
method results in having to interact with the kernel only a little
more than once a second, no matter how many statements we execute
in between. The mainline code might never call setitimer() at all
after the first time, while each time the signal handler fires,
it sees that the then-pending request is most of a second away,
and that's when it sets the next interrupt request for. Each
mainline timeout-set request after that will observe that the time
it wants is past the pending interrupt request time, and do nothing.
This also works pretty well for cases where a few different timeout
lengths are in use, as long as none of them are very short. But
that describes our usage well.
Idea and original by Thomas Munro; I fixed a race condition
and improved the comments.
Discussion: https://postgr.es/m/CA+hUKG+o6pbuHBJSGnud=TadsuXySWA7CCcPgCt2QE9F6_4iHQ@mail.gmail.com
Tomas Vondra [Wed, 6 Jan 2021 20:46:26 +0000 (21:46 +0100)] Report progress of COPY commands
This commit introduces a view pg_stat_progress_copy, reporting progress
of COPY commands. This allows rough estimates how far a running COPY
progressed, with the caveat that the total number of bytes may not be
available in some cases (e.g. when the input comes from the client).
Author: Josef Šimánek
Reviewed-by: Fujii Masao, Bharath Rupireddy, Vignesh C, Matthias van de Meent
Discussion: https://postgr.es/m/CAFp7QwqMGEi4OyyaLEK9DR0+E+oK3UtA4bEjDVCa4bNkwUY2PQ@mail.gmail.com
Discussion: https://postgr.es/m/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com
Tom Lane [Wed, 6 Jan 2021 15:51:14 +0000 (10:51 -0500)] Add a test module for the regular expression package.
This module provides a function test_regex() that is functionally
rather like regexp_matches(), but with additional debugging-oriented
options and additional output. The debug options are somewhat obscure;
they are chosen to match the API of the test harness that Henry Spencer
wrote way-back-when for use in Tcl. With this, we can import all the
test cases that Spencer wrote originally, even for regex functionality
that we don't currently expose in Postgres. This seems necessary
because we can no longer rely on Tcl to act as upstream and verify
any fixes or improvements that we make.
In addition to Spencer's tests, I added a few for lookbehind
constraints (which we added in 2015, and Tcl still hasn't absorbed)
that are modeled on his tests for lookahead constraints. After looking
at code coverage reports, I also threw in a couple of tests to more
fully exercise our "high colormap" logic.
According to my testing, this brings the check-world coverage
for src/backend/regex/ from 71.1% to 86.7% of lines.
(coverage.postgresql.org shows a slightly different number,
which I think is because it measures a non-assert build.)
Discussion: https://postgr.es/m/
2873268.
1609732164@sss.pgh.pa.us
Peter Eisentraut [Wed, 6 Jan 2021 09:15:19 +0000 (10:15 +0100)] Replace CLOBBER_CACHE_ALWAYS with run-time GUC
Forced cache invalidation (CLOBBER_CACHE_ALWAYS) has been impractical
to use for testing in PostgreSQL because it's so slow and because it's
toggled on/off only at build time. It is helpful when hunting bugs in
any code that uses the sycache/relcache because causes cache
invalidations to be injected whenever it would be possible for an
invalidation to occur, whether or not one was really pending.
Address this by providing run-time control over cache clobber
behaviour using the new debug_invalidate_system_caches_always GUC.
Support is not compiled in at all unless assertions are enabled or
CLOBBER_CACHE_ENABLED is explicitly defined at compile time. It
defaults to 0 if compiled in, so it has negligible effect on assert
build performance by default.
When support is compiled in, test code can now set
debug_invalidate_system_caches_always=1 locally to a backend to test
specific queries, functions, extensions, etc. Or tests can toggle it
globally for a specific test case while retaining normal performance
during test setup and teardown.
For backwards compatibility with existing test harnesses and scripts,
debug_invalidate_system_caches_always defaults to 1 if
CLOBBER_CACHE_ALWAYS is defined, and to 3 if CLOBBER_CACHE_RECURSIVE
is defined.
CLOBBER_CACHE_ENABLED is now visible in pg_config_manual.h, as is the
related RECOVER_RELATION_BUILD_MEMORY setting for the relcache.
Author: Craig Ringer <
[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CAMsr+YF=+ctXBZj3ywmvKNUjWpxmuTuUKuv-rgbHGX5i5pLstQ@mail.gmail.com
Fujii Masao [Wed, 6 Jan 2021 03:39:18 +0000 (12:39 +0900)] Detect the deadlocks between backends and the startup process.
The deadlocks that the recovery conflict on lock is involved in can
happen between hot-standby backends and the startup process.
If a backend takes an access exclusive lock on the table and which
finally triggers the deadlock, that deadlock can be detected
as expected. On the other hand, previously, if the startup process
took an access exclusive lock and which finally triggered the deadlock,
that deadlock could not be detected and could remain even after
deadlock_timeout passed. This is a bug.
The cause of this bug was that the code for handling the recovery
conflict on lock didn't take care of deadlock case at all. It assumed
that deadlocks involving the startup process and backends were able
to be detected by the deadlock detector invoked within backends.
But this assumption was incorrect. The startup process also should
have invoked the deadlock detector if necessary.
To fix this bug, this commit makes the startup process invoke
the deadlock detector if deadlock_timeout is reached while handling
the recovery conflict on lock. Specifically, in that case, the startup
process requests all the backends holding the conflicting locks to
check themselves for deadlocks.
Back- to v9.6. v9.5 has also this bug, but per discussion we decided
not to back- the fix to v9.5. Because v9.5 doesn't have some
infrastructure codes (e.g.,
37c54863cf) that this bug fix depends on.
We can apply those codes for the back-, but since the next minor
version release is the final one for v9.5, it's risky to do that. If we
unexpectedly introduce new bug to v9.5 by the back-, there is no
chance to fix that. We determined that the back- to v9.5 would give
more risk than gain.
Author: Fujii Masao
Reviewed-by: Bertrand Drouvot, Masahiko Sawada, Kyotaro HoriguchiDiscussion: https://postgr.es/m/
4041d6b6-cf24-a120-36fa-
1294220f8243@oss.nttdata.com
Amit Kapila [Wed, 6 Jan 2021 03:26:19 +0000 (08:56 +0530)] Fix typos in decode.c and logical.c.
Per report by Ajin Cherian in email:
https://postgr.es/m/CAFPTHDYnRKDvzgDxoMn_CKqXA-D0MtrbyJvfvjBsO4G=UHDXkg@mail.gmail.com
Fujii Masao [Wed, 6 Jan 2021 02:58:23 +0000 (11:58 +0900)] doc: Fix description about default behavior of recovery_target_timeline.
The default value of recovery_target_timeline was changed in v12,
but the description about the default behavior of that was not updated.
Back- to v12 where the default behavior of recovery_target_timeline
was changed.
Author: Benoit Lobréau
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CAPE8EZ7c3aruEmM24GYkj8y8WmHKD1m9TtPtgCF0nQ3zw4LCkQ@mail.gmail.com
Michael Paquier [Wed, 6 Jan 2021 01:52:26 +0000 (10:52 +0900)] Promote --data-checksums to the common set of options in initdb --help
This was previously part of the section dedicated to less common
options, but it is an option commonly used these days.
Author: Michael Banck
Reviewed-by: Stephen Frost, Michael PaquierDiscussion: https://postgr.es/m/
d7938aca4d4ea8e8c72c33bd75efe9f8218fe390[email protected]Tom Lane [Wed, 6 Jan 2021 00:03:56 +0000 (19:03 -0500)] Revert unstable test cases from commit
7d80441d2.
I momentarily forgot that the "owner" column wouldn't be stable
in the buildfarm. Oh well, these tests weren't very valuable
anyway.
Discussion: https://postgr.es/m/
20201130165436[email protected]Tom Lane [Tue, 5 Jan 2021 23:41:50 +0000 (18:41 -0500)] Allow psql's \dt and \di to show TOAST tables and their indexes.
Formerly, TOAST objects were unconditionally suppressed, but since
\d is able to print them it's not very clear why these variants
should not. Instead, use the same rules as for system catalogs:
they can be seen if you write the 'S' modifier or a table name
pattern. (In practice, since hardly anybody would keep pg_toast
in their search_path, it's really down to whether you use a pattern
that can match pg_toast.*.)
No docs change seems necessary because the docs already say that
this happens for "system objects"; we're just classifying TOAST
tables as being that.
Justin Pryzby, reviewed by Laurenz Albe
Discussion: https://postgr.es/m/
20201130165436[email protected]Tom Lane [Tue, 5 Jan 2021 21:18:01 +0000 (16:18 -0500)] Introduce a new GUC_REPORT setting "in_hot_standby".
Aside from being queriable via SHOW, this value is sent to the client
immediately at session startup, and again later on if the server gets
promoted to primary during the session. The immediate report will be
used in an upcoming to avoid an extra round trip when trying to
connect to a primary server.
Haribabu Kommi, Greg Nancarrow, Tom Lane; reviewed at various times
by Laurenz Albe, Takayuki Tsunakawa, Peter Smith.
Discussion: https://postgr.es/m/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
Bruce Momjian [Tue, 5 Jan 2021 19:26:37 +0000 (14:26 -0500)] doc: improve NLS instruction wording
Reported-by: "Tang, Haiying"Discussion: https://postgr.es/m/
bbbccf7a3c2d436e85d45869d612fd6b@G08CNEXMBPEKD05.g08.fujitsu.local
Author: "Tang, Haiying"
Back-through: 9.5
Dean Rasheed [Tue, 5 Jan 2021 11:52:42 +0000 (11:52 +0000)] Add an explicit cast to double when using fabs().
Commit
bc43b7c2c0 used fabs() directly on an int variable, which
apparently requires an explicit cast on some platforms.
Per buildfarm.
Dean Rasheed [Tue, 5 Jan 2021 11:15:28 +0000 (11:15 +0000)] Fix numeric_power() when the exponent is INT_MIN.
In power_var_int(), the computation of the number of significant
digits to use in the computation used log(Abs(exp)), which isn't safe
because Abs(exp) returns INT_MIN when exp is INT_MIN. Use fabs()
instead of Abs(), so that the exponent is cast to a double before the
absolute value is taken.
Back- to 9.6, where this was introduced (by
7d9a4737c2).
Discussion: https://postgr.es/m/CAEZATCVd6pMkz=BrZEgBKyqqJrt2xghr=fNc8+Z=5xC6cgWrWA@mail.gmail.com
Amit Kapila [Tue, 5 Jan 2021 09:35:22 +0000 (15:05 +0530)] Test decoding of two-phase transactions during the build of a consistent snapshot.
Commit
a271a1b50e added the capability to allow decoding at prepare time.
This adds an isolation testcase to test that decoding happens at commit
time when the consistent snapshot state is reached after prepare but
before commit prepared.
Author: Ajin Cherian
Reviewed-by: Amit KapilaDiscussion:
https://postgr.es/m/
02DA5F5E-CECE-4D9C-8B4B-
418077E2C010@postgrespro.ru
https://postgr.es/m/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3+Q@mail.gmail.com
Peter Eisentraut [Tue, 5 Jan 2021 07:17:40 +0000 (08:17 +0100)] doc: Document how to run regression tests with custom server settings
Author: Craig Ringer <
[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CAMsr+YF=+ctXBZj3ywmvKNUjWpxmuTuUKuv-rgbHGX5i5pLstQ@mail.gmail.com
Peter Geoghegan [Tue, 5 Jan 2021 03:46:11 +0000 (19:46 -0800)] Standardize one aspect of rmgr desc output.
Bring heap and hash rmgr desc output in line with nbtree and GiST desc
output by using the name latestRemovedXid for all fields that output the
contents of the latestRemovedXid field from the WAL record's C struct
(stop using local variants).
This seems like a clear improvement because latestRemovedXid is a symbol
name that already appears across many different source files, and so is
probably much more recognizable.
Discussion: https://postgr.es/m/CAH2-Wzkt_Rs4VqPSCk87nyjPAAEmWL8STU9zgET_83EF5YfrLw@mail.gmail.com
Amit Kapila [Tue, 5 Jan 2021 02:35:08 +0000 (08:05 +0530)] Fix typo in origin.c.
Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+PsReyuvww_Fn1NN_Vsv0wBP1bnzuhzRFr_2=y1nNZrG7w@mail.gmail.com
Amit Kapila [Tue, 5 Jan 2021 02:26:40 +0000 (07:56 +0530)] Fix typo in reorderbuffer.c.
Author: Zhijie Hou
Reviewed-by: Sawada MasahikoDiscussion: https://postgr.es/m/
ba88bb58aaf14284abca16aec04bf279@G08CNEXMBPEKD05.g08.fujitsu.local
Thomas Munro [Tue, 5 Jan 2021 01:00:16 +0000 (14:00 +1300)] Replace remaining uses of "whitelist".
Instead describe the action that the list effects, or just use "list"
where the meaning is obvious from context.
Author: Dagfinn Ilmari Mannsåker <
[email protected]>
Discussion: https://postgr.es/m/
20200615182235.x7lch5n6kcjq4aue%40alap3.anarazel.de
Thomas Munro [Tue, 5 Jan 2021 00:27:06 +0000 (13:27 +1300)] pgindent: whitelist/blacklist -> additional/excluded.
Author: Dagfinn Ilmari Mannsåker <
[email protected]>
Discussion: https://postgr.es/m/
20200615182235.x7lch5n6kcjq4aue%40alap3.anarazel.de
Thomas Munro [Mon, 4 Jan 2021 23:06:15 +0000 (12:06 +1300)] Rename "enum blacklist" to "uncommitted enums".
We agreed to remove this terminology and use something more descriptive.
Discussion: https://postgr.es/m/
20200615182235.x7lch5n6kcjq4aue%40alap3.anarazel.de
Tom Lane [Mon, 4 Jan 2021 23:32:40 +0000 (18:32 -0500)] Fix integer-overflow corner cases in substring() functions.
If the substring start index and length overflow when added together,
substring() misbehaved, either throwing a bogus "negative substring
length" error on a case that should succeed, or failing to complain that
a negative length is negative (and instead returning the whole string,
in most cases). Unsurprisingly, the text, bytea, and bit variants of
the function all had this issue. Rearrange the logic to ensure that
negative lengths are always rejected, and add an overflow check to
handle the other case.
Also install similar guards into detoast_attr_slice() (nee
heap_tuple_untoast_attr_slice()), since it's far from clear that
no other code paths leading to that function could pass it values
that would overflow.
by myself and Pavel Stehule, per bug #16804 from Rafi Shamim.
Back- to v11. While these bugs are old, the common/int.h
infrastructure for overflow-detecting arithmetic didn't exist before
commit
4d6ad3125, and it doesn't seem like these misbehaviors are bad
enough to justify developing a standalone fix for the older branches.
Discussion: https://postgr.es/m/16804-
f4eeeb6c11ba71d4@postgresql.org
Thomas Munro [Mon, 4 Jan 2021 22:40:03 +0000 (11:40 +1300)] Remove unused function s.
Cleanup for commit
dee663f7.
Reported-by: Tomas Vondra <[email protected]>Discussion: https://postgr.es/m/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com
Tom Lane [Mon, 4 Jan 2021 17:39:27 +0000 (12:39 -0500)] Rethink the "read/write parameter" mechanism in pl/pgsql.
Performance issues with the preceding to re-implement array
element assignment within pl/pgsql led me to realize that the read/write
parameter mechanism is misdesigned. Instead of requiring the assignment
source expression to be such that *all* its references to the target
variable could be passed as R/W, we really want to identify *one*
reference to the target variable to be passed as R/W, allowing any other
ones to be passed read/only as they would be by default. As long as the
R/W reference is a direct argument to the top-level (hence last to be
executed) function in the expression, there is no harm in R/O references
being passed to other lower parts of the expression. Nor is there any
use-case for more than one argument of the top-level function being R/W.
Hence, rewrite that logic to identify one single Param that references
the target variable, and make only that Param pass a read/write
reference, not any other Params referencing the target variable.
Discussion: https://postgr.es/m/
4165684.
1607707277@sss.pgh.pa.us
Tom Lane [Mon, 4 Jan 2021 17:14:37 +0000 (12:14 -0500)] Remove PLPGSQL_DTYPE_ARRAYELEM datum type within pl/pgsql.
In the wake of the previous commit, we don't really need this anymore,
since array assignment is primarily handled by the core code.
The only way that that code could still be reached is that a GET
DIAGNOSTICS target variable could be an array element. But that
doesn't seem like a particularly essential feature. I'd added it
in commit
55caaaeba, but just because it was easy not because
anyone had actually asked for it. Hence, revert that and
then remove the now-unreachable stuff. (If we really had to,
we could probably reimplement GET DIAGNOSTICS using the new
assignment machinery; but the cost/benefit ratio looks very poor,
and it'd likely be a bit slower.)
Note that PLPGSQL_DTYPE_RECFIELD remains. It's possible that we
could get rid of that too, but maintaining the existing behaviors
for RECORD-type variables seems like it might be difficult. Since
there's not any functional limitation in those code paths as there
was in the ARRAYELEM code, I've not pursued the idea.
Discussion: https://postgr.es/m/
4165684.
1607707277@sss.pgh.pa.us
Tom Lane [Mon, 4 Jan 2021 16:52:00 +0000 (11:52 -0500)] Re-implement pl/pgsql's expression and assignment parsing.
Invent new RawParseModes that allow the core grammar to handle
pl/pgsql expressions and assignments directly, and thereby get rid
of a lot of hackery in pl/pgsql's parser. This moves a good deal
of knowledge about pl/pgsql into the core code: notably, we have to
invent a CoercionContext that matches pl/pgsql's (rather dubious)
historical behavior for assignment coercions. That's getting away
from the original idea of pl/pgsql as an arm's-length extension of
the core, but really we crossed that bridge a long time ago.
The main advantage of doing this is that we can now use the core
parser to generate FieldStore and/or SubscriptingRef nodes to handle
assignments to pl/pgsql variables that are records or arrays. That
fixes a number of cases that had never been implemented in pl/pgsql
assignment, such as nested records and array slicing, and it allows
pl/pgsql assignment to support the datatype-specific subscripting
behaviors introduced in commit
c7aba7c14.
There are cosmetic benefits too: when a syntax error occurs in a
pl/pgsql expression, the error report no longer includes the confusing
"SELECT" keyword that used to get prefixed to the expression text.
Also, there seem to be some small speed gains.
Discussion: https://postgr.es/m/
4165684.
1607707277@sss.pgh.pa.us
Tom Lane [Mon, 4 Jan 2021 16:03:22 +0000 (11:03 -0500)] Add the ability for the core grammar to have more than one parse target.
This essentially allows gram.y to implement a family of related
syntax trees, rather than necessarily always parsing a list of SQL
statements. raw_parser() gains a new argument, enum RawParseMode,
to say what to do. As proof of concept, add a mode that just parses
a TypeName without any other decoration, and use that to greatly
simplify typeStringToTypeName().
In addition, invent a new SPI entry point SPI_prepare_extended() to
allow SPI users (particularly plpgsql) to get at this new functionality.
In hopes of making this the last variant of SPI_prepare(), set up its
additional arguments as a struct rather than direct arguments, and
promise that future additions to the struct can default to zero.
SPI_prepare_cursor() and SPI_prepare_params() can perhaps go away at
some point.
Discussion: https://postgr.es/m/
4165684.
1607707277@sss.pgh.pa.us
Michael Paquier [Mon, 4 Jan 2021 10:47:58 +0000 (19:47 +0900)] Simplify some comments in xml.c
Author: Justin Pryzby
Discussion: https://postgr.es/m/X/
[email protected]Amit Kapila [Mon, 4 Jan 2021 03:04:50 +0000 (08:34 +0530)] Allow decoding at prepare time in ReorderBuffer.
This allows PREPARE-time decoding of two-phase transactions (if the
output plugin supports this capability), in which case the transactions
are replayed at PREPARE and then committed later when COMMIT PREPARED
arrives.
Now that we decode the changes before the commit, the concurrent aborts
may cause failures when the output plugin consults catalogs (both system
and user-defined).
We detect such failures with a special sqlerrcode
ERRCODE_TRANSACTION_ROLLBACK introduced by commit
7259736a6e and stop
decoding the remaining changes. Then we rollback the changes when rollback
prepared is encountered.
Author: Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich
Reviewed-by: Amit Kapila, Peter Smith, Sawada Masahiko, Arseny Sher, and Dilip KumarTested-by: Takamichi OsumiDiscussion:
https://postgr.es/m/
02DA5F5E-CECE-4D9C-8B4B-
418077E2C010@postgrespro.ru
https://postgr.es/m/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3+Q@mail.gmail.com
Bruce Momjian [Sat, 2 Jan 2021 18:06:25 +0000 (13:06 -0500)] Update copyright for 2021
Back-through: 9.5
Tom Lane [Fri, 1 Jan 2021 20:51:09 +0000 (15:51 -0500)] Doc: improve explanation of EXTRACT(EPOCH) for timestamp without tz.
Try to be clearer about what computation is actually happening here.
Per bug #16797 from Dana Burd.
Discussion: https://postgr.es/m/16797-
f264b0b980b53b8b@postgresql.org
Peter Geoghegan [Thu, 31 Dec 2020 01:21:42 +0000 (17:21 -0800)] Get heap page max offset with buffer lock held.
On further reflection it seems better to call PageGetMaxOffsetNumber()
after acquiring a buffer lock on the page. This shouldn't really
matter, but doing it this way is cleaner.
Follow-up to commit
42288174.
Back: 12-, just like commit
42288174Peter Geoghegan [Thu, 31 Dec 2020 00:29:05 +0000 (16:29 -0800)] Fix index deletion latestRemovedXid bug.
The logic for determining the latest removed XID for the purposes of
generating recovery conflicts in REDO routines was subtly broken. It
failed to follow links from HOT chains, and so failed to consider all
relevant heap tuple headers in some cases.
To fix, expand the loop that deals with LP_REDIRECT line pointers to
also deal with HOT chains. The new version of the loop is loosely based
on a similar loop from heap_prune_chain().
The impact of this bug is probably quite limited, since the horizon code
necessarily deals with heap tuples that are pointed to by LP_DEAD-set
index tuples. The process of setting LP_DEAD index tuples (e.g. within
the kill_prior_tuple mechanism) is highly correlated with opportunistic
pruning of pointed-to heap tuples. Plus the question of generating a
recovery conflict usually comes up some time after index tuple LP_DEAD
bits were initially set, unlike heap pruning, where a latestRemovedXid
is generated at the point of the pruning operation (heap pruning has no
deferred "would-be page split" style processing that produces conflicts
lazily).
Only back to Postgres 12, the first version where this logic runs
during original execution (following commit
558a9165e08). The index
latestRemovedXid mechanism has had the same bug since it first appeared
over 10 years ago (in commit
a760893d), but backing to all
supported versions now seems like a bad idea on balance. Running the
new improved code during recovery seems risky, especially given the lack
of complaints from the field.
Author: Peter Geoghegan <
[email protected]>
Discussion: https://postgr.es/m/CAH2-Wz=Eib393+HHcERK_9MtgNS7Ew1HY=RDC_g6GL46zM5C6Q@mail.gmail.com
Back: 12-
Tom Lane [Wed, 30 Dec 2020 22:48:43 +0000 (17:48 -0500)] Doc: spell out comparison behaviors for the date/time types.
The behavior of cross-type comparisons among date/time data types was
not really explained anywhere. You could probably infer it if you
recognized the applicability of comments elsewhere about datatype
conversions, but it seems worthy of explicit documentation.
Per bug #16797 from Dana Burd.
Discussion: https://postgr.es/m/16797-
f264b0b980b53b8b@postgresql.org
Tom Lane [Wed, 30 Dec 2020 19:15:41 +0000 (14:15 -0500)] More fixups for pg_upgrade cross-version tests.
Commit
7ca37fb04 removed regress_putenv from the regress.so library,
so reloading a SQL function dependent on that would not work.
Fix similarly to
52202bb39.
Per buildfarm.
Alexander Korotkov [Wed, 30 Dec 2020 18:11:31 +0000 (21:11 +0300)] Refactor multirange_in()
This commit preserves the logic of multirange_in() but makes it more clear
what's going on. Also, this commit fixes the compiler warning spotted by the
buildfarm.
Reported-by: Tom LaneDiscussion: https://postgr.es/m/
2246043.
1609290699%40sss.pgh.pa.us
Tom Lane [Wed, 30 Dec 2020 17:55:59 +0000 (12:55 -0500)] Use setenv() in preference to putenv().
Since at least 2001 we've used putenv() and avoided setenv(), on the
grounds that the latter was unportable and not in POSIX. However,
POSIX added it that same year, and by now the situation has reversed:
setenv() is probably more portable than putenv(), since POSIX now
treats the latter as not being a core function. And setenv() has
cleaner semantics too. So, let's reverse that old policy.
This commit adds a simple src/port/ implementation of setenv() for
any stragglers (we have one in the buildfarm, but I'd not be surprised
if that code is never used in the field). More importantly, extend
win32env.c to also support setenv(). Then, replace usages of putenv()
with setenv(), and get rid of some ad-hoc implementations of setenv()
wannabees.
Also, adjust our src/port/ implementation of unsetenv() to follow the
POSIX spec that it returns an error indicator, rather than returning
void as per the ancient BSD convention. I don't feel a need to make
all the call sites check for errors, but the portability stub ought
to match real-world practice.
Discussion: https://postgr.es/m/
2065122.
1609212051@sss.pgh.pa.us
Alexander Korotkov [Wed, 30 Dec 2020 17:19:15 +0000 (20:19 +0300)] Fix selectivity estimation @> (anymultirange, anyrange) operator
Attempt to get selectivity estimation for @> (anymultirange, anyrange) operator
caused an error in buildfarm, because this operator was missed in switch()
of calc_hist_selectivity(). Fix that and also make regression tests reliably
check that selectivity estimation for (multi)ranges doesn't fall. Previously,
whether we test selectivity estimation for (multi)ranges depended on
whether autovacuum managed to gather concurrently to the test.
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/X%2BwmgjRItuvHNBeV%40paquier.xyz
Tom Lane [Wed, 30 Dec 2020 16:38:42 +0000 (11:38 -0500)] Fix up usage of krb_server_keyfile GUC parameter.
secure_open_gssapi() installed the krb_server_keyfile setting as
KRB5_KTNAME unconditionally, so long as it's not empty. However,
pg_GSS_recvauth() only installed it if KRB5_KTNAME wasn't set already,
leading to a troubling inconsistency: in theory, clients could see
different sets of server principal names depending on whether they
use GSSAPI encryption. Always using krb_server_keyfile seems like
the right thing, so make both places do that. Also fix up
secure_open_gssapi()'s lack of a check for setenv() failure ---
it's unlikely, surely, but security-critical actions are no place
to be sloppy.
Also improve the associated documentation.
This does nothing about secure_open_gssapi()'s use of setenv(),
and indeed causes pg_GSS_recvauth() to use it too. That's nominally
against project portability rules, but since this code is only built
with --with-gssapi, I do not feel a need to do something about this
in the back branches. A fix will be forthcoming for HEAD though.
Back- to v12 where GSSAPI encryption was introduced. The
dubious behavior in pg_GSS_recvauth() goes back further, but it
didn't have anything to be inconsistent with, so let it be.
Discussion: https://postgr.es/m/
2187460.
1609263156@sss.pgh.pa.us
Michael Paquier [Wed, 30 Dec 2020 12:23:24 +0000 (21:23 +0900)] Sanitize IF NOT EXISTS in EXPLAIN for CTAS and matviews
IF NOT EXISTS was ignored when specified in an EXPLAIN query for CREATE
MATERIALIZED VIEW or CREATE TABLE AS. Hence, if this clause was
specified, the caller would get a failure if the relation already
exists instead of a success with a NOTICE message.
This commit makes the behavior of IF NOT EXISTS in EXPLAIN consistent
with the non-EXPLAIN'd DDL queries, preventing a failure with IF NOT
EXISTS if the relation to-be-created already exists. The skip is done
before the SELECT query used for the relation is planned or executed,
and a "dummy" plan is generated instead depending on the format used by
EXPLAIN.
Author: Bharath Rupireddy
Reviewed-by: Zhijie Hou, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACVa3oJ9O_wcGd+FtHWZds04dEKcakxphGz5POVgD4wC7Q@mail.gmail.com