From b706172d22a29962c42c90f2c3c704c2a4141660 Mon Sep 17 00:00:00 2001
From: Bruce Momjian <bruce@momjian.us>
Date: Tue, 31 Oct 2023 11:42:02 -0400
Subject: [PATCH] C comment:  improve statistics computation comment example

Discussion: https://postgr.es/m/CAKFQuwbD672Sc0EXv0ifx3pzfQ5UAEpiAeaBGKz_Ox-4d2NGCA@mail.gmail.com

Author: David G. Johnston

Backpatch-through: master
---
 doc/src/sgml/planstats.sgml | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml
index 43ad57253e..c7ec749d0a 100644
--- a/doc/src/sgml/planstats.sgml
+++ b/doc/src/sgml/planstats.sgml
@@ -389,18 +389,20 @@ tablename  | null_frac | n_distinct | most_common_vals
 </programlisting>
 
    In this case there is no <acronym>MCV</acronym> information for
-   <structfield>unique2</structfield> because all the values appear to be
-   unique, so we use an algorithm that relies only on the number of
-   distinct values for both relations together with their null fractions:
+   <structname>unique2</structname> and all the values appear to be
+   unique (n_distinct = -1), so we use an algorithm that relies on the row
+   count estimates for both relations (num_rows, not shown, but "tenk")
+   together with the column null fractions (zero for both):
 
 <programlisting>
-selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
+selectivity = (1 - null_frac1) * (1 - null_frac2) / max(num_rows1, num_rows2)
             = (1 - 0) * (1 - 0) / max(10000, 10000)
             = 0.0001
 </programlisting>
 
    This is, subtract the null fraction from one for each of the relations,
-   and divide by the maximum of the numbers of distinct values.
+   and divide by the row count of the larger relation (this value does get
+   scaled in the non-unique case).
    The number of rows
    that the join is likely to emit is calculated as the cardinality of the
    Cartesian product of the two inputs, multiplied by the
-- 
2.30.2