Skip to content
This repository was archived by the owner on May 19, 2020. It is now read-only.

Migrate to a well known PostGIS docker base #13

Closed
wants to merge 5 commits into from

Conversation

nyurik
Copy link
Member

@nyurik nyurik commented Mar 19, 2019

Continued excellent work by @smellman.
Now the container is based on mdillon/postgis:11, and only adds the osml10n extension.

Also a minor version bump to osml10n 2.5.5

Note This PR has already been published on Docker Hub under sophox/postgis (:latest is PG 11, and also :9.6) -- feel free to use it directly from there.

Removed most of the custom building,
leaving only the osml10n extension.

Bump version of osml10n to 2.5.5
@nyurik
Copy link
Member Author

nyurik commented Mar 22, 2019

Pinging @stirringhalo @keosak @jirik How do you manage dockerhub tag versioning? This PR clearly needs to use a new version, while keeping older docker containers in place. Thanks!!!

@smellman
Copy link

@nyurik Can you re-create docker image and push to docker-hub?
Yesterday, mdillon/postgis:11 upgrade to postgis 2.5.2.
Also I checked the slow problem with postgresql 10 in my branch, so I want to research postgis 2.5.2 or postgresql 9.6.
Now I join into slack

@nyurik
Copy link
Member Author

nyurik commented Mar 22, 2019

@smellman thx for the update - my docker hub settings were incorrect - now i fixed it, it should auto-build whenever the base changes.

@nyurik nyurik closed this Mar 22, 2019
@nyurik nyurik deleted the base branch March 22, 2019 17:00
@nyurik nyurik restored the base branch March 22, 2019 21:13
@nyurik nyurik reopened this Mar 22, 2019
@nyurik
Copy link
Member Author

nyurik commented Mar 22, 2019

Heads up: Per @Komzpa, PostGIS 2.5 might be slower with MVT geometries than the dev version that is currently in master, but at the same time Darafei said that 2.5 fixes incorrect MVT geometry handling (looses peaces of geometries). The same issue also applies to the official PostGIS 2.4). PostGIS 3.0 should fix that... when its released in half a year (?) -- 3.0 allows PostGIS to use wagyu instead of geos, making it far faster. In the mean time, it might be faster to just use geos master. See also PostGIS install reqs

@smellman
Copy link

@nyurik I tested PostGIS svn-trunk with wagyu and it still slow in openmaptiles.

@Komzpa
Copy link

Komzpa commented Mar 26, 2019

Is the source of slowness identified?

@smellman
Copy link

@Komzpa Please check my result.
Now I am researching what version raise this issue.

@smellman
Copy link

Also, It is not PostgreSQL issue.
I tested with PostgreSQL 9.6, 10, 11 and PostGIS 2.5.1, 2.4.7, svn-trunk.
I just get this slowness problem with running curl http://localhost:8090/tiles/0/0/0.pbf 6 times so I said it is 0/0/0.pbf problem.

@Komzpa
Copy link

Komzpa commented Mar 26, 2019

running a PREPARE'd query first five times it uses normal planning and then switches to generic plan if costs say it's not worth planning better. Tile 0/0/0 performs seq scan of everything so it caches that?

https://www.postgresql.org/docs/9.6/sql-prepare.html

Prepared statements can use generic plans rather than re-planning with each set of supplied EXECUTE values. This occurs immediately for prepared statements with no parameters; otherwise it occurs only after five or more executions produce plans whose estimated cost average (including planning overhead) is more expensive than the generic plan cost estimate. Once a generic plan is chosen, it is used for the remaining lifetime of the prepared statement. Using EXECUTE values which are rare in columns with many duplicates can generate custom plans that are so much cheaper than the generic plan, even after adding planning overhead, that the generic plan might never be used.

@nyurik
Copy link
Member Author

nyurik commented Mar 26, 2019

I wonder if this could be a case of over-optimization -- it doesn't matter how fast a single 0/0/0 tile generates. It is more important to have average generation to be as good or better than before, and we need to do some QA for that.

@smellman
Copy link

I tested with PostgreSQL 9.6 + PostGIS 2.4.0, I get slowness issue.
So the over-optimization problem happens between 2.4.0dev(rev: ff0a844e606622f45841fc25221bbaa136ed1001) and 2.4.0 stable release.

@smellman
Copy link

@nyurik @Komzpa
I just found first bad commit using git bisect: https://gist.github.com/smellman/b4e1b223a3e8a08f1519b502c7bdca31
postgis/postgis@296871a
I just make revert commit and I will test it.
If you know about first bad commit, please tell me.

@smellman
Copy link

My previous commit run well, but I make other commit by my friend advice.

@smellman
Copy link

hmm, my last commit was bad.

@Komzpa
Copy link

Komzpa commented Mar 31, 2019

@smellman can you file a ticket on PostGIS trac with these findings?

@smellman
Copy link

smellman commented Apr 1, 2019

@Komzpa https://trac.osgeo.org/postgis/ticket/4362#ticket I just write a ticket. I tried to write English:-)

@pramsey
Copy link

pramsey commented Apr 12, 2019

I put in a small change, I don't necessarily expect it to alter this case, I'm still wrapping my head around what's changing the plan. Clearly the index scans are not a great idea, but the selectivity estimate must be low enough for the planner to thing they are OK. It's such a huge oddball of a query though, I'm loath to muck with things much without understanding where the selectivity estimate is going wrong relative to the data.

For each table, can you compare the estimate (select _postgis_selectivity(tablename, geomcolumnname, bbox, 2)) to the actual ratio of number of features returned

select 
  count(*) filter (where geomcolumnname && bbox) / count(*) as selectivity,
  count(*) as count,
  count(*) filter (where geomcolumnname && bbox) as count_filtered 
from tablename

to see where we have particularly bad estimates?

@smellman
Copy link

openmaptiles=# select
openmaptiles-#   count(*) filter (where geometry && ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244))) / count(*) as selectivity,
openmaptiles-#   count(*) as count,
openmaptiles-#   count(*) filter (where geometry && ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244))) as count_filtered
openmaptiles-# from layer_waterway(ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244)), z(559082264.0287178));
ERROR:  division by zero

Function can't calculate.

openmaptiles=# select
openmaptiles-#   count(*) filter (where geometry && ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244))) / count(*) as selectivity,
openmaptiles-#   count(*) as count,
openmaptiles-#   count(*) filter (where geometry && ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244))) as count_filtered
openmaptiles-# from water_z0;
 selectivity | count | count_filtered
-------------+-------+----------------
           1 |    27 |             27
(1 row)
openmaptiles=# select
  count(*) filter (where geometry && ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244))) / count(*) as selectivity,
  count(*) as count,
  count(*) filter (where geometry && ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244))) as count_filtered
from ne_110m_ocean;
 selectivity | count | count_filtered
-------------+-------+----------------
           1 |     2 |              2
(1 row)
openmaptiles=# select
  count(*) filter (where geometry && ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244))) / count(*) as selectivity,
  count(*) as count,
  count(*) filter (where geometry && ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244))) as count_filtered
from ne_110m_lakes;
 selectivity | count | count_filtered
-------------+-------+----------------
           1 |    25 |             25
(1 row)

View and table can calculate but

openmaptiles=# select _postgis_selectivity('water_z0', geometry, ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244)), 2);
ERROR:  column "geometry" does not exist
LINE 1: select _postgis_selectivity('water_z0', geometry, ST_MakeBox...
                                                ^
openmaptiles=# select _postgis_selectivity('public.ne_110m_ocean', geometry, ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244)), 2);
ERROR:  column "geometry" does not exist
LINE 1: ...lect _postgis_selectivity('public.ne_110m_ocean', geometry, ...
                                                             ^

_postgis_selectivity raise error in view and table.

@smellman
Copy link

@pramsey I can run _postgis_selectivity without last 2, is it OK?

openmaptiles=# select _postgis_selectivity('public.ne_110m_ocean', 'geometry', ST_MakeBox2D(ST_Point(-20037508.342789244, -20037508.342789255), ST_Point(20037508.342789244, 20037508.342789244)));
 _postgis_selectivity
----------------------
    0.980297923088074
(1 row)

@nyurik
Copy link
Member Author

nyurik commented Aug 3, 2019

@MartinMikita this PR has been pending for a while, and should allow us to move forward, especially considering that very soon PostGIS plans to release v3.0, which will include greatly improved ST_AsMVT tile generation performance. If someone is using the current docker build, they can continue using via specific version tag.

@klokan
Copy link
Member

klokan commented Aug 15, 2019

Kind reminder for @MartinMikita

@MartinMikita
Copy link

This PR will be validated on our side before merging into the code.

@nyurik
Copy link
Member Author

nyurik commented May 17, 2020

This repository has been moved to openmaptiles-tools as docker/postgis directory.

See also openmaptiles/openmaptiles-tools#252

@nyurik nyurik closed this May 17, 2020
@nyurik nyurik deleted the base branch May 17, 2020 03:41
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants