Working Group Minutes/EWG 2013-05-20
- rails_port README
  - gravitystorm has made progress on the READMEs and raised a few items for discussion by the group.
  - ACTION: gravitystorm create platform-specific install documentation in INSTALL.$platform.md files
  - ACTION: gravitystorm change the databases in example.config.yml (and the docs) to be osm-development/osm-test/osm-production
- Carto benchmarking
  - pnorman has benchmarked openstreetmap-carto against osm.xml and found a slowdown of 22% for the from-RAM set and 12% for the larger from-disk set.
  - It was generally considered that the slowdown in production would fall somewhere in this range, and would not be so bad that it couldn't be handled with some hardware mitigations.
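The relationship between a drop in throughput and the extra time spent per tile can be worked through as follows. This is a minimal Python sketch, assuming the 12% and 22% figures are decreases in tiles rendered per second (one reading of the benchmark post); the function name `extra_time_per_tile` is hypothetical.

```python
def extra_time_per_tile(throughput_drop):
    """Fractional increase in time per tile for a fractional drop in throughput.

    Assumption: throughput_drop is the relative decrease in tiles per second,
    so time per tile scales by 1 / (1 - throughput_drop).
    """
    if not 0 <= throughput_drop < 1:
        raise ValueError("throughput_drop must be in [0, 1)")
    return 1.0 / (1.0 - throughput_drop) - 1.0


# Figures quoted in the minutes: 12% (from-disk set) and 22% (from-RAM set).
for label, drop in (("from-disk", 0.12), ("from-ram", 0.22)):
    extra = extra_time_per_tile(drop)
    print(f"{label}: {drop:.0%} fewer tiles/s -> {extra:.1%} more time per tile")
```

Under that assumption, the 12% to 22% range corresponds to roughly 14% to 28% more rendering time per tile, which is the more useful figure when estimating how much the render queues might grow.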
17:03:30 <zere> minutes of the last meeting: http://www.osmfoundation.org/wiki/Working_Group_Minutes/EWG_2013-05-13
17:04:42 <zere> gravitystorm: any news on the READMEs?
17:05:43 <gravitystorm> zere: nope, still in progress. A question for the group though - how best to deal with platform-specific notes? I know the ones on the wiki are all hopelessly outdated, but I haven't nailed a strategy for replacements. Options:
17:06:27 <gravitystorm> 1) All crammed into INSTALL.md 2) Extra pages on the openstreetmap-website/wiki on github 3) extra pages on the osm.org wiki 4) INSTALL.windows.md etc
17:06:31 <gravitystorm> thoughts?
17:07:28 <zere> i would prefer the INSTALL.$platform.md approach, but having them in a .gh-pages branch would also be good.
17:07:52 <zere> i suppose the main idea would be that they're well-written, and not overflowing with outdated information like the wiki is
17:08:17 <apmon> If we go the route of having the main doc in INSTALL.md, I think I'd vote for INSTALL.$platform.md
17:09:25 <gravitystorm> OK, unless anyone suggests otherwise, that's what I'll go for
17:10:10 <zere> awesome. before the meeting started, pnorman reported: "best guess on osm.xml vs osm-carto is 10%-20% slower for carto. 10% on in-memory working set, 20% if it has to hit the disk. other changes like upgrading postgres/postgis may speed up things."
17:10:51 <gravitystorm> Second question, fall under the "worth the effort?" category - I'd like to rename the databases. Currently they are osm/osm-test/openstreetmap, and I think that can lead to confusion for newbies. I propose osm-development/osm-test/osm-production and amending the docs accordingly
17:11:22 <RichardF> +1. that confused me a bit when setting up the rails_port last week.
17:11:35 <RichardF> (do most people need to set up anything apart from development anyway?)
17:12:04 <gravitystorm> RichardF: few people need production, but development + test is common
17:12:41 <zere> if you're doing development... you should have test as well!
17:12:54 <zere> s/have/want/, rather.
17:13:47 <gravitystorm> Does anyone know if there is other documentation that would be impacted by this change, e.g. switch2osm or something similar?
17:15:03 <zere> i don't think so... isn't switch2osm more about the tiles?
17:15:46 <RichardF> yep
17:16:09 <apmon> zere: Wasn't it the other way round? 20% in-memory and 10% if it hits disk?
17:16:47 <gravitystorm> OK, I'm done on rails-port docs
17:17:16 <gravitystorm> #action gravitystorm create platform-specific install documentation in INSTALL.$platform.md files
17:17:30 <RichardF> gravitystorm: yell if you need any help on OS X stuff
17:17:32 <zere> apmon: that wasn't what pnorman said 45 mins ago.
17:17:54 <apmon> Can you check with him again on that then?
17:18:07 <gravitystorm> #action gravitystorm change the databases in example.config.yml (and the docs) to be osm-development/osm-test/osm-production
17:18:58 <zere> apmon: yup. hopefully he'll be back before the end of the meeting and can explain the results.
17:19:08 <gravitystorm> apmon: you're correct on this, going by his original mailing list post
17:19:45 <gravitystorm> http://lists.openstreetmap.org/pipermail/tile-serving/2013-May/000217.html
17:19:51 <apmon> Yes, all of the previous discussions was that way round, so I guess he might have just said it wrong in the last discussion with zere
17:20:04 <gravitystorm> For the from-ram [...] a decrease of 22%.
17:20:28 <gravitystorm> For the larger [from-disk] set [...] a decrease of 12%.
17:21:10 <apmon> So one question is, when moving from EC2 disks to SSDs on yevaud / orm, will it behave more like in memory or like from-disk
17:22:12 <gravitystorm> Well, perhaps we shouldn't guess too much, and say that it's likely to be between a 12-22% slowdown. What consequences are there for that?
17:22:40 <apmon> Not too many
17:24:45 <zere> splitting the difference - 17% slowdown... probably not a massive problem. does mean that the queues might go up a bit.
17:25:07 <apmon> On yevaud, we'd likely hit the queue full situation more often, but it should be fine most of the time. On Orm, the faster server should compensate for the carto slowdown
17:25:42 <zere> if the from-ram is a 22% slowdown, does that mean that the carto is making mapnik do 22% more work? is that the right conclusion to draw?
17:27:06 <zere> (well, to be precise 27% more work)
17:27:28 <apmon> It is possible that there are some differences in postgresql as well
17:28:28 <apmon> It is possible that mapnik hits postgresql with the same queries multiple times (as I think layer caching wasn't turned on)
17:28:55 <apmon> The consequent times queries comeing from ram.
17:30:38 <gravitystorm> yes, as the style has been developed, the SQL queries are diverging between the two styles
17:31:41 <gravitystorm> I don't think there's anything to do (or much to discuss) until specific actions are identified as per a+b in http://lists.openstreetmap.org/pipermail/tile-serving/2013-May/000232.html
17:32:23 <gravitystorm> But the overarching point is - do we *need* to improve performance? Or is it just a nice-to-have?
17:33:20 <zere> somewhere in between the two :-)
17:34:30 <gravitystorm> :-)
17:34:41 <zere> on the one hand, no - we don't *need* to improve performance. we had a discussion last week about various ways of using multiple servers for rendering which would eliminate that need. but they're all ideas at this point - not much working code.
17:35:38 <zere> on the other hand, if performance improvements were made, it would extend the life and the capabilities of the existing servers. and possibly lead to the ability to do more resource-intensive cartography.
17:35:46 <Firefishy> orm can now be safely pulled from the tile caches. I need to do final spec and purchasing of SSD for orm.
17:36:53 <Firefishy> I have just pulled orm from tile caches.
17:37:01 <TomH> and we need to fix the style to download coastline data from somewhere that isn't tile.osm...
17:37:03 <gravitystorm> zere: OK, but I think what you're saying is that without some kind of change, then we *can't* put openstreetmap-carto onto the tileserver?
17:37:28 <zere> no, i think pnorman's benchmarking shows that the slowdown won't be too bad.
17:37:35 <gravitystorm> TomH: OK, I can wget it onto a different server :-)
17:38:23 <zere> but if there are some things (e.g: z13) which are clearly slow then it looks like that would be something worth looking into.
17:38:49 <Firefishy> We can re-point parent cache within minutes and per tile cache... so we can test load on orm per tile cache region and during specific periods.
17:38:57 <gravitystorm> zere: OK. I just want to clarify what's blocking deployment, as opposed to what osm-ewg thinks is worth working on
17:39:46 <Firefishy> So for example, I can move Australia to orm for a 24 hour testing period and then keep adding regions.
17:41:25 <zere> gravitystorm: yeah. unless i'm proved horribly wrong by spiralling queues - i think a 15%-ish slowdown is a nuisance not a disaster.
17:41:44 <Firefishy> We would need to slowly migrate the tile traffic anyway, so that orm is able to build up a base of tiles. 2 or 3 weeks at a guess.
17:44:20 <zere> worth pointing out that, from pnorman's results, z13 is 33% of the total render time. yevaud's statistics put z13+z14 at 29% of the total render time.
17:45:26 <zere> might just be an artefact of the benchmarking sample, but it would seem that a 10-20% speedup could be reclaimed from z13.
17:45:36 <apmon> So that matches reasonably well
17:46:46 <apmon> Are there any more statistics one could build into mod_tile / renderd to check things are running smoothly once in production?
17:47:20 <apmon> zere: Between, renderd should spit out the detailed statistics per zoom level, only munin bunches them up into groups of zoom level
17:48:44 <Firefishy> The SSD IOPs rate should increase from yevaud to orm: 39500 IOPs (Intel 320) -> 100000 IOPs (Samsung 840 Pro)
17:49:05 <Firefishy> 270 MB/s -> 540 MB/s
17:49:35 <zere> i wonder whether it's possible to have some tool to just look at the quantity of data that each layer's query pulls in from the database across a bunch of zoom levels?
17:50:38 <zere> presumably it's easy enough to pull the queries out of the carto mml file. is it easy to figure out which queries are used at which zoom levels?
17:51:02 <apmon> woodpeck had a tool like that a while ago
17:51:52 <apmon> https://github.com/openstreetmap/mapnik-stylesheets/blob/master/utils/stylecheck.pl
17:51:59 <apmon> Not sure if it still works
17:55:00 <apmon> Looks like it does still work on the default osm.xml
17:55:53 <apmon> http://pastebin.com/V3hGU9Mu is the result for the default osm.xml
18:00:07 <zere> quite a lot of stuff kicks in at 13, according to that...
18:01:11 <zere> i guess the next step might be to execute those queries, or at least find out how many features they return.
18:01:20 <zere> something for next week perhaps.
18:01:27 <zere> thanks to everyone for coming!
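The question raised at the end of the meeting, which layers become active at which zoom levels, can be sketched roughly as below. This is an illustrative Python sketch, not a port of stylecheck.pl. It assumes a Mapnik XML stylesheet whose entities are already expanded, looks only at each rule's MinScaleDenominator and MaxScaleDenominator, and ignores filters and layer-level limits. The names `layers_by_zoom`, `zoom_scale` and `SAMPLE_XML` are hypothetical, and the sample stylesheet is made up for the example.

```python
import xml.etree.ElementTree as ET

# Scale denominator commonly used for zoom 0 with 256px web-mercator tiles.
Z0_SCALE_DENOMINATOR = 559082264.028


def zoom_scale(zoom):
    """Approximate scale denominator for a given zoom level."""
    return Z0_SCALE_DENOMINATOR / (2 ** zoom)


def layers_by_zoom(mapnik_xml, zooms=range(0, 19)):
    """Map each zoom level to the names of layers with at least one active rule."""
    zooms = list(zooms)
    root = ET.fromstring(mapnik_xml)

    # Style name -> list of (min, max) scale-denominator ranges, one per rule.
    style_ranges = {}
    for style in root.iter("Style"):
        style_ranges[style.get("name")] = [
            (
                float(rule.findtext("MinScaleDenominator", default="0")),
                float(rule.findtext("MaxScaleDenominator", default="inf")),
            )
            for rule in style.iter("Rule")
        ]

    active = {z: [] for z in zooms}
    for layer in root.iter("Layer"):
        styles = [s.text for s in layer.findall("StyleName")]
        ranges = [r for name in styles for r in style_ranges.get(name, [])]
        for z in zooms:
            scale = zoom_scale(z)
            # A rule is treated as active when min <= scale < max.
            if any(lo <= scale < hi for lo, hi in ranges):
                active[z].append(layer.get("name"))
    return active


# Made-up two-layer stylesheet: "minor-roads" only renders from zoom 13 upwards.
SAMPLE_XML = """
<Map>
  <Style name="land"><Rule/></Style>
  <Style name="minor-roads">
    <Rule><MaxScaleDenominator>100000</MaxScaleDenominator></Rule>
  </Style>
  <Layer name="land"><StyleName>land</StyleName></Layer>
  <Layer name="minor-roads"><StyleName>minor-roads</StyleName></Layer>
</Map>
"""

result = layers_by_zoom(SAMPLE_XML)
for z in (12, 13):
    print(z, result[z])
```

Counting the features each active layer's query returns per zoom level, the next step suggested in the meeting, would need a database connection and is left out of this sketch.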