OpenStreetMap Foundation, Operations Meeting* - Agenda & Minutes
Friday May 22nd 2020, 18:00 London time
Location: Video room at https://osmvideo.cloud68.co
* Please note that this was not strictly an OWG meeting.
- Grant Slater (OWG)
- Tom Hughes (OWG)
- Paul Norman (OWG, board)
- Emilie Laffray (OWG - joined ~ 20' after start)
- Allan Mustard (board)
- Guillaume Rischard (board - joined for a few minutes)
- Michal Migurski
Minutes by Dorothea.
Action items from this meeting
- Paul to update the Github ticket "Adding API key support for tile.osm.org" [Topic: Adding API key support for tile.osm.org]
- OPS team: draft an email (regarding a call for proposals), ask for comments. [Topic: Adding API key support for tile.osm.org]
- Paul to draft an email about a "Turning off Trac" announcement. [Topic: Turn off Trac]
- Paul to look at first steps for getting some front-end servers in UCL/Slough. [Topic: Addition of servers to UCL/Slough]
- Tom to look at removing back-ends completely from infrastructure. [Topic: Consider dropping split frontend/backend configuration for web site]
- Paul to update minimum specs for new caches to 12 or 16 G RAM, SSD back storage. [Topic: Changing specs for new caches]
On 2020-04-10 action item: Push up tile usage policy
- Enforcement of the policy might shift the numbers and that relies on technical changes, like better tracking/blocking (audit block, similar to what Nominatim has).
On 2020-04-10 topic: New Data Centre Space? (Pending question: Can we find expertise to run it?)
- Reasonable measure to have a 2nd data centre in west Europe.
- Worry about Bytemark - it has been bought, and its willingness to host us may have waned.
- Paul got quotes at time of move to Equinix.
Emilie joined ~ 20' after start
Adding API key support for tile.osm.org?
No reasonable way of implementing.
- Success of implementation from a technological point of view.
- That implementation might increase support load.
- Distribution of keys
On whether hiring a junior sysadmin would help
junior not in sense of experience, but junior to Grant and Tom
- alternative proposal
Need design review, even for senior person.
Could ask for proposals or contract someone to work out a proposal of what a system like this would look like.
- Paul to update the Github ticket "Adding API key support for tile.osm.org"
- OPS team: draft an email (regarding a call for proposals), ask for comments.
Turn off Trac
Open tickets: 40 for website. Andy is working on them.
Banner for about a year saying that it's deprecated.
- Announce that we're turning off Trac and have deprecated SVN.
- Provide timeline for making service read only.
- Provide time for turning the service off.
Action item: Paul to draft an email about the announcement.
Addition of servers to UCL/Slough
- 28 cores, 192 G of memory
- Bought for caching purposes and potentially rendering later.
- Allocated fixed number of Ethernet ports.
- Getting at least one or two frontend and backend servers.
- Purchasing a second switch.
- Specs: Close to what we've got in the US.
- Paul to look at first steps for getting some front-end servers in UCL/Slough.
Consider dropping split frontend/backend configuration for web site
https://github.com/openstreetmap/operations/issues/424 (issue created post-meeting)
- CPU and RAM have scaled faster than our load curve.
- Removes the requirement of having upgrades.
- Reduces power demand, leaving room to put in better hardware elsewhere if we need it.
- Tom to look at removing back-ends completely from infrastructure.
Not to be publicly minuted.
Changing specs for new caches
- Largest cache 128 G RAM plus split base the logs.
- Few: 80 G RAM
- Recent problems in VMs with lower specs.
- Minimum 12 G RAM.
- SSD back storage.
Exceptions: caches in locations where little traffic is expected.
- Paul to update minimum specs for new caches to 12 or 16 G RAM, SSD back storage.
Michal's report into archiving help.openstreetmap.org (sent via email prior to the meeting)
Related to action item: 2020-05-22 Michal to look into archiving help (priority) and trac - Source
Michal's results: static files with simple modifications to each page to indicate its archived status. Since the site is largely built on simple HTML it’s compatible with a static archiving approach. I’ve demonstrated one using AWS S3.
OSQA is a Django 1.6 application written in Python with a Postgres database. We self-host the site and configure it from a Chef cookbook. The site currently contains 75k questions and over 8,000 users dating back a full decade. Tom Hughes provided me with a snapshot of the site database and a partial server access log which allowed me to determine the overall scope of the content and which parts are popular.
I tried three approaches to archiving.
First, I ran wget against the public site doing a simple web-crawl similar to how a search engine might interact with it. Wget has the ability to mirror content locally, but I found that it took too long to be practical due to a large number of filtering, sorting, and pagination URL parameters which inflated the number of redundant requests.
Second, I tried installing OSQA directly to interact with Django’s administrative tools using a working site. OSQA’s version of Django is as old as the site, and setup instructions were somewhat outdated. I next attempted to use OSM’s OSQA Chef cookbook to do the same thing, but got stuck early and decided on a different approach.
Last, I used the access logs to generate a list of top URLs on the help site and added to that a list of all users, tags, questions, and badges to completely cover all valuable site content while stripping redundant parameters for filtering, sorting, and pagination.
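Editor's note: the log-based step above could look roughly like the following minimal sketch. It is not Michal's script. It assumes the access log uses the common/combined log format, and the parameter names in REDUNDANT_PARAMS are hypothetical placeholders rather than OSQA's actual filtering, sorting, and pagination parameters.

```python
import re
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names; the real OSQA query parameters may differ.
REDUNDANT_PARAMS = {"sort", "page", "pagesize", "filter"}

# Matches the request part of a common/combined log format line (assumed format).
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def canonical_path(url):
    """Drop redundant query parameters so equivalent pages collapse to one URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in REDUNDANT_PARAMS
    ]
    return urlunsplit(("", "", parts.path, urlencode(kept), ""))


def top_urls(log_lines, limit=10):
    """Return the most requested canonical URLs found in the log lines."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[canonical_path(match.group(1))] += 1
    return [url for url, _ in counts.most_common(limit)]
```

The resulting list could then be merged with the lists of users, tags, questions, and badges taken from the database snapshot before mirroring.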
I’ve prepared this static S3 bucket with a selection of pages suitable for archiving: http://help.osm.org-static-archive.s3-website-us-east-1.amazonaws.com/
To make these pages work as a historical archive, I made several changes to them:
- Removed interaction interface elements for logged-in users like vote buttons, favorites, forms for asking and answering questions, and the login page
- Removed interface elements depending on a dynamic web backend such as the search form and sorting, filtering, and pagination links, and subscription features like RSS
- Added a prominent date-of-archiving note to the top of each page
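Editor's note: two of the changes listed above (removing forms and adding an archive note) could be sketched as follows. This is an illustration under stated assumptions, not the script used for the archive. The regular expressions are a deliberate simplification that would be fragile on complex markup, and the "archive-note" class name and the note wording are hypothetical.

```python
import re
from datetime import date

# Simplified patterns; a real HTML parser would be more robust.
FORM_RE = re.compile(r"<form\b.*?</form>", re.IGNORECASE | re.DOTALL)
BODY_RE = re.compile(r"(<body\b[^>]*>)", re.IGNORECASE)


def archive_page(html, archived_on):
    """Strip interactive forms and add a dated archive note at the top of the body."""
    html = FORM_RE.sub("", html)
    note = (
        '<p class="archive-note">This page was archived on '
        f"{archived_on.isoformat()} and is read-only.</p>"
    )
    # Insert the note immediately after the first opening <body> tag.
    return BODY_RE.sub(lambda match: match.group(1) + note, html, count=1)
```

Calling archive_page(page_html, date(2020, 5, 22)) on each mirrored page would return the modified HTML, ready to be uploaded as a static file.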
The result is pretty clean. Here are some sample pages showing how questions, tags, users, and badges have been archived. Click around, but keep in mind that not every link will work since the archive was based on a log from one week ago:
This is the Python script I use to mirror site pages and modify HTML: https://gist.github.com/migurski/0ac8c84a462e95a2a119831d833e5251
Recurring date and time of meeting
Dorothea to create poll for recurring date/time