Operations/Minutes/2021-05-19

From OpenStreetMap Foundation

OpenStreetMap Foundation, Operations Meeting* - Agenda & draft minutes
Wednesday 19 May 2021 18:00 London time
Location: Video room at https://osmvideo.cloud68.co

* Please note that this was not strictly an OWG meeting.

Participants

Present:

Minutes by Dorothea Kazazi.

Apologies:

Administrative

Previous minutes

2021-05-05

Action items

  • 2021-05-19 Grant to give Twitter credentials to Paul (was: 2021-05-05 Grant to check/fix GroupTweet for osm_tech Twitter account)
  • 2021-05-05 Grant to email Toby from WMF and suggest chatting to MapTiler. [Topic: Wikimedia]
  • 2021-05-05 Paul to do a circular for ISP in a couple of days. [Topic: Dublin updates - Network requirements change]
  • 2021-05-05 Grant to provide switches model and vendor to Paul. [Topic: Dublin updates - Someone to handle network purchasing]
  • 2021-04-21 Paul to work out where we need the new HP DL360 servers. [Topic: New HP DL360 servers] # 2021-05-19 on the agenda.
  • 2021-04-21 Paul to tweet asking for recommendation of HP resellers in Ireland. [Topic: New HP DL360 servers] # 2021-05-19 will tweet once he gets the info.
  • 2021-04-21 Paul to check how raid1 with hot spare works out with the budget. [Topic: New Rendering server]
  • 2021-04-07 Grant to get contact info of ISP to Guillaume [Topic: Improving networking in AMS]. # 2021-04-21 couldn't find it. He'll try to figure out who's available in the new data center in AMS. # 2021-05-19 decision to be removed
  • 2021-03-24 Paul to create ticket related to API PostgreSQL update [Topic: API PostgreSQL update]
  • 2021-03-24 Hrvoje to check power supplies on Viserion/Drogon [Topic: Old tile caches: Viserion and Drogon] # 2021-05-19 decision to be removed.
  • 2021-03-10 Paul to have a look at TimescaleDB. [Topic: TimescaleDB] # 2021-05-19 decision to be removed as talking to TimescaleDB community, which has more expertise.
  • 2021-02-24 Tom to report back on TimescaleDB again at next meeting. [Topic: Reportage] [was: 2021-01-13 Tom to evaluate TimescaleDB] [Topic: Longer term metric retention] # 2021-04-21 SSD Disk Failing in US # 2021-05-19 decision to leave on the agenda.
  • 2021-02-24 OWG --> Grant to install a Discourse instance to get us started. [Topic: Discourse] # 2021-04-21 on the agenda. # 2021-05-19 pending.
  • 2021-01-13 OWG to send message to the servers we want to keep. [Reportage. Existing CDN servers] # 2021-03-24 Three servers stopped talking to us (shenron, naga and one more) # 2021-05-19 pending.
  • 2021-01-13 Grant to wipe thorns and the 3 other machines [AMS] [Topic: Longer term metric retention] # 2021-05-19 pending.
  • 2021-01-13 Paul to create ticket with Equinix to scrap the wiped thorns and the other 3 machines [Topic: Longer term metric retention]
  • 2021-01-13 Paul to create a ticket related to tile geographical localisation. [Topic: Lack of render capacity] # 2021-05-19 Done.
  • 2020-12-02 Grant to develop some thoughts on what is next for us using AWS. [Topic: AWS] # 2021-05-19 pending.
  • 2020-11-04 Grant to do heavy integrity checks to katla to test its response to heavy load. # 2021-03-24 Grant has got some disks to replace. Needs to open ticket with Bytemark. # 2021-05-19 Done.
  • 2020-11-04 OWG to work out tile log archival and deletion policy at later stage. [Topic: Commercial CDN] # 2021-03-24 & 2021-05-19 deferred to future point.
  • 2020-10-21 Paul to write to Discourse ticket and email the board [Topic: Discourse]
  • 2020-09-23 Grant to put in touch Guillaume and Toby. [Topic: Wikimedia challenges with Tile CDN delivery] Grant to check up on status. # 2021-05-19 superseded by email to be written.
  • 2020-09-23 OWG to pencil out what is needed. [Topic: Wikimedia challenges with Tile CDN delivery] # 2021-05-19 superseded by email to be written.
  • 2020-09-23 Toby Negrin (Wikimedia) to ask Wikimedia whether they would be interested in OSMF running a tile service available to Wikipedia and if they would be willing to share hardware resources or expertise. [Topic: Wikimedia challenges with Tile CDN delivery] # 2021-05-19 superseded by email to be written.
  • 2020-09-09 Tom to update OAuth ticket https://github.com/openstreetmap/openstreetmap-website/issues/1408 [2020-09-09 Reportage, related to 2020-08-26 action item] # 2021-05-19 Done.
  • 2020-09-09 Grant [Topic: AWS] Speak to AWS person about going ahead with open data program with official OSM S3 bucket. # 2021-05-19 pending.
  • 2020-09-09 [Not assigned] [Topic: AWS] Decide on services we need to run on AWS. Need clearance. # 2021-05-19 overlap with future AWS usage - decision to have a single ticket.
  • 2020-09-09 [Not assigned] [Topic: AWS] Work out rough budget. # 2021-05-19 decision to remove as budget will be worked out once decided what to run.
  • 2020-09-09 Grant [Topic: AWS] Talk to OpenAerial Map/HOT. # 2021-05-19 pending.
  • 2020-09-09 [Not assigned] [Topic: Federating OSM communities' rooms through OSMF-hosted Matrix servers] Evaluate effort required. Constrain the scope to what we can support and perhaps ask volunteers to step in. # 2021-05-19 decision to remove. Stick with Discourse for the time being.
  • 2020-09-09 [Topic: Ironbelly replacement] Paul to work out a proposal for the ironbelly replacement. # 2021-05-19 on agenda.
  • 2020-08-26 Tom to look at road ahead for OAuth. [Topic: Merge forums, OSQA, MLs to discourse?] https://github.com/openstreetmap/openstreetmap-website/issues/1408 # 2020-09-09 Did some investigation - branch with some code. Better understanding of OAuth 2 and options. Doable. # 2021-05-19 decision to remove as superseded by more recent action items.
  • 2020-08-26 Grant to talk to Ian about migrating old content to Discourse. [Topic: Merge forums, OSQA, MLs to discourse?] # 2020-09-09 pending. # 2021-05-19 Paul has stricken this through.
  • 2020-08-26 [Not assigned] Create Github ticket for updated OAuth. [Topic: Merge forums, OSQA, MLs to discourse?]
  • 2020-08-12 Michal to try to rekindle excitement about people helping with imagery (on dev channel/imagery channel or Slack). # 2020-08-26 No progress.
  • 2020-07-29 Grant to enable background sync to Amazon Web Services (AWS) S3. [Topic: Ironbelly] # 2020-08-12&26 Manually run, automated scripting to be added. # 2021-05-19 Grant to run the script again.
  • 2020-07-29 Grant to check with Wiki Admins on hCaptcha (reCaptcha replacement). [Topic: Wiki reCaptcha issue] https://github.com/openstreetmap/operations/issues/454 # 2020-08-12 hCaptcha people reached out and happy to help. Blocker on Mediawiki 1.35 being released in August. # 2021-05-19 blocker removed.
  • 2020-07-15 Paul and Grant to quote up a server to replace errol/kessie. [Topic: Replacement of Errol/Kessie]. # 2020-08-12 A new person in OWG asked to do Errol. Need to replace it at some point - at University College London. # 2021-05-19 pending.
  • 2020-07-15 Ian to try converting fluxBB DB to go into Discourse. [Topic: OSM Forum (FluxBB) update]. # Evaluating whether moving is an option. Need to see about history, user log-in. # 2021-05-19 decision to leave the action item open.
  • 2020-07-01 Paul to create a ticket about solutions to reduce incoming comms. [Topic: Revision of acceptable use policy to reduce incoming comms] # 2021-05-19 decision to leave the action item open.
  • 2020-07-01 Grant to work out some of the questions for an online form as a solution to reduce incoming comms. [Topic: Revision of acceptable use policy to reduce incoming comms] # 2020-08-12 need to think about the reply # 2021-05-19 decision to leave the action item open.
  • 2020-07-01 Michal to reach to Amazon Web Services (AWS) (need a story for AWS to show how their help will lead to AWS spending from users). [Topic: Commercial CDN for Bulk Tile Users] https://lists.openstreetmap.org/pipermail/talk/2020-May/084700.html # 2020-08-12 Michal feels blocked, could draft something. We got contacted by AWS, not replied yet. More info at 2020-08-12 reportage. # 2021-05-19 decision to remove.
  • 2020-06-04 Paul to update the Github ticket "Adding API key support for tile.osm.org" https://github.com/openstreetmap/operations/issues/342
  • 2020-06-04 OPS team: draft an email (regarding a call for proposals), ask for comments. [Topic: Adding API key support for tile.osm.org https://github.com/openstreetmap/operations/issues/342] # 2021-05-19 decision to remove.
  • 2020-04-10 OWG to push up tile usage policy (commercial entities, vehicle tracking applications - which are heavy on Nominatim and probably not attributing as well) [Topic: Commercial CDN for Bulk Tile Users] # 2021-05-19 decision to remove.
  • 2020-04-10 Grant to work out a table of different data bits, work out how they are backed up and what can be potentially improved. [Topic: High Availability / Redundancy of OpenStreetMap.org (and primary services)] # 2021-05-19 decision to leave the action item open.
  • 2020-04-10 [Not assigned]: Potentially move some more of backup data into long term S3 buckets. [Topic: High Availability / Redundancy of OpenStreetMap.org (and primary service)] # 2021-05-19 decision to remove.

Reportage

Netlify related request

Request for updated permissions - to be granted.

From action item updates

Replacing reCaptcha with hCaptcha.

  • hCaptcha better supported now in Mediawiki.
  • Bugs with reCaptcha and the simple editor.

HP DL360 Gen9 servers for Dublin

Paul's estimate of the number of servers required has changed: 7 when budgeting, 10 now. (https://github.com/openstreetmap/operations/issues/525)

Improved locality of backend tile requests

https://github.com/openstreetmap/operations/issues/527
Europe tile requests are being split based on metatile coordinate into server groups of one powerful server + one weak server. Significant reduction in rendering workload.
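
As a rough illustration of splitting requests by metatile coordinate, the sketch below maps every tile in the same metatile to the same server group, so a metatile is only rendered by one group. It is not the deployed configuration: the 8x8 metatile size, the group contents and the modulo-based selection rule are all assumptions made for the example.

```python
# Illustrative sketch only; not the configuration used on the tile CDN.
# Assumption: metatiles are 8x8 tiles.
METATILE = 8

# Hypothetical server groups, each pairing one powerful and one weak server.
GROUPS = [
    ("strong-a", "weak-a"),
    ("strong-b", "weak-b"),
]


def metatile_coordinate(x: int, y: int, size: int = METATILE) -> tuple[int, int]:
    """Return the index of the metatile that contains tile (x, y)."""
    return (x // size, y // size)


def pick_group(x: int, y: int, groups=GROUPS):
    """Choose a server group from the metatile coordinate.

    All tiles inside one metatile share the same (mx, my), so they are
    always routed to the same group.
    """
    mx, my = metatile_coordinate(x, y)
    return groups[(mx + my) % len(groups)]
```

With this rule, tiles (512, 340) and (519, 343) fall in the same metatile and are sent to the same group, while the neighbouring metatile starting at x = 520 goes to the other group.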

Will this cause problems with failover?

Decision: Test failover by stopping Apache on odin or ysera for 10 minutes at 22:00 or 23:00 UK time.

Planet servers

We need to decide the plan for the planet servers: will they need a large (>30 TB) RAID array, or will we use an object store? What needs to happen?

3 things that need storage

  • planet server
    • has backups, probably a significant portion of that
    • backups should be moved to AWS
    • Grant to provide breakdown of usage of planet server storage space
    • Paul to look at new server with enough storage to replace ironbelly
  • dev server
  • imagery server

Action: Grant to provide breakdown of planet server files.

Breakdown of planet server files (incomplete run):

  • 3.4 TB: backups
  • 400 GB: log files
  • 300 GB: current run of a planet dump
  • Rails storage (old user images and GPX files)

AWS

  • Planet serving portion of S3 might be provided for free (potentially replication and services we need to run the S3 bucket). Wouldn't run tile services.
  • Concern: development time.

Decision: 2U machine. Suggestion: 32TB + 25%. Will depend on run.

Future: add extra disks to slots. No concern for unmatched disks.

QGIS

Topic added after request of Sarah Hoffmann (Nominatim).

Deferred until we can talk to Sarah.

10Gb Switches

£3500 each minimum

Action: Grant to price up options for review and decision.

RAM for new DB server in Dublin

DB server: 11.04 TB used.

Decision: 0.5 TB RAM

Dublin tickets

https://github.com/openstreetmap/operations/milestone/5

Suggestion: split ticket "cabling and accessories" https://github.com/openstreetmap/operations/issues/529

Open Ops Tickets

Review open tickets: which need a policy decision and which need someone to help.
https://github.com/openstreetmap/operations/issues

Action items from this meeting

  • Grant to give Twitter credentials to Paul. [Action item updates]
  • Grant to provide breakdown of planet server files. [Topic: Planet servers]
  • Grant to price up 10Gb Switches options for review and decision. [Topic: 10Gb Switches]

Next meeting

Wednesday 2 June 2021 18:00 London time

Operations meetings are currently being held every 2 Wednesdays, at 18:00 London time.
Online calendar showing the OPS meetings.