Operations/Minutes/2021-04-07

OpenStreetMap Foundation, Operations Meeting* - Agenda & draft minutes
Wednesday 7 April 2021 18:00 London time
Location: Video room at https://osmvideo.cloud68.co

* Please note that this was not strictly an OWG meeting.

Participants

Present:

Minutes by Dorothea Kazazi.

Apologies:

Administrative

Previous minutes

2021-03-24

Action items

  • 2021-03-24 Paul to gather some options regarding the new data centre. [Topic: New data centre]
  • 2021-03-24 Tom to provide trace from Hetzner related to IPv6 connectivity issues to Grant. [Topic: IPV6]
  • 2021-03-24 Grant/Paul to report the connectivity issue to Cogent. [Topic: IPV6]
  • 2021-03-24 Paul to create ticket related to API PostgreSQL update. [Topic: API PostgreSQL update]
  • 2021-03-24 Grant to look at the cost of having as many CI runners as wanted. Related: split AWS account, so that CI does not run on master. [Topic: CI]
  • 2021-03-24 Paul to create a ticket about OTRS. [Topic: OTRS]
  • 2021-03-24 Hrvoje to check power supplies on Viserion/Drogon. [Topic: Old tile caches: Viserion and Drogon]
  • 2021-03-10 Tom to fix the Wordpress updater. [Topic: Wordpress updates] #2021-03-24 update: No action now. [Reportage]
  • 2021-03-10 Paul to have a look at TimescaleDB. [Topic: TimescaleDB]
  • 2021-02-24 Tom to report back on TimescaleDB again at next meeting. [Topic: Reportage] [was: 2021-01-13 Tom to evaluate TimescaleDB] [Topic: Longer term metric retention]
  • 2021-02-24 OWG--> Grant to install a Discourse instance to get us started. [Topic: Discourse]
  • 2021-02-24 OWG to get a new DB server for Dublin - pending board budget "level" approval (already included in "High" option). [Topic: Katla]
  • 2021-02-24 OWG--> Grant to check with Fastly if they are happy to be credited as a top 3 donor. [Topic: Tile CDN]
  • 2021-02-10 Paul to gather details about data centers near Dublin.
  • 2021-01-13 OWG to send message to the servers we want to keep. [Reportage. Existing CDN servers] #2021-03-24 Three servers stopped talking to us (shenron, naga and one more)
  • 2021-01-13 Grant to wipe thorns and the 3 other machines [AMS]. [Topic: Longer term metric retention]
  • 2021-01-13 Paul to create ticket with Equinix to scrap the wiped thorns and the other 3 machines. [Topic: Longer term metric retention]
  • 2021-01-13 Paul to create a ticket related to tile geographical localisation. [Topic: Lack of render capacity]
  • 2020-12-02 Grant to develop some thoughts on what is next for us using AWS. [Topic: AWS]
  • 2020-11-04 Grant to do heavy integrity checks to Katla to test its response to heavy load. #2021-03-24 Grant has got some disks to replace. Needs to open ticket with Bytemark.
  • 2020-11-04 OWG to work out tile log archival and deletion policy at later stage. [Topic: Commercial CDN] #2021-03-24 deferred to future point.
  • 2020-10-21 Paul to write to Discourse ticket and email the board. [Topic: Discourse]
  • 2020-09-23 Grant to put in touch Guillaume and Toby. [Topic: Wikimedia challenges with Tile CDN delivery] Grant to check up on status.
  • 2020-09-23 OWG to pencil out what is needed. [Topic: Wikimedia challenges with Tile CDN delivery]
  • 2020-09-23 Toby Negrin (Wikimedia) to ask Wikimedia whether they would be interested in OSMF running a tile service available to Wikipedia and if they would be willing to share hardware resources or expertise. [Topic: Wikimedia challenges with Tile CDN delivery]
  • 2020-09-09 Tom to update OAuth ticket https://github.com/openstreetmap/openstreetmap-website/issues/1408 #2020-09-09 Reportage, related to 2020-08-26 action item
  • 2020-09-09 [Not assigned] [Topic: AWS] Speak to AWS person about going ahead with open data program with official OSM S3 bucket.
  • 2020-09-09 [Not assigned] [Topic: AWS] Decide on services we need to run on AWS. Need clearance.
  • 2020-09-09 [Not assigned] [Topic: AWS] Work out rough budget.
  • 2020-09-09 [Not assigned] [Topic: AWS] Talk to OpenAerial Map/HOT.
  • 2020-09-09 [Not assigned] [Topic: Federating OSM communities' rooms through OSMF-hosted Matrix servers] Evaluate effort required. Constrain the scope to what we can support and perhaps ask volunteers to step in.
  • 2020-09-09 Paul to work out a proposal for the Ironbelly replacement. [Topic: Ironbelly replacement]
  • 2020-08-26 Tom to look at road ahead for OAuth. [Topic: Merge forums, OSQA, MLs to discourse?] https://github.com/openstreetmap/openstreetmap-website/issues/1408 #2020-09-09 Did some investigation - branch with some code. Better understanding of OAuth 2 and options. Doable.
  • 2020-08-26 Grant to talk to Ian about migrating old content to Discourse. [Topic: Merge forums, OSQA, MLs to discourse?] #2020-09-09 pending.
  • 2020-08-26 [not assigned] Create Github ticket for updated OAuth. [Topic: Merge forums, OSQA, MLs to discourse?]
  • 2020-08-12 Michal to try to rekindle excitement about people helping with imagery (on dev channel/imagery channel or Slack). #2020-08-26 No progress.
  • 2020-07-29 Grant to enable background sync to AWS S3. [Topic: Ironbelly] #2020-08-12 & 2020-08-26 Manually run, automated scripting to be added.
  • 2020-07-29 Grant to check with Wiki Admins on hCaptcha (reCaptcha replacement). [Topic: Wiki reCaptcha issue] https://github.com/openstreetmap/operations/issues/454 #2020-08-12 hCaptcha people reached out and happy to help. Blocker on Mediawiki 1.35 being released in August.
  • 2020-07-15 Paul and Grant to quote up a server to replace errol/kessie. [Topic: Replacement of Errol/Kessie]. #2020-08-12 A new person in OWG asked to do Errol. Need to replace it at some point - at UCL.
  • 2020-07-15 Ian to try converting fluxBB DB to go into Discourse. [Topic: OSM Forum (FluxBB) update]. # Evaluating whether moving is an option. Need to see about history, user log-in.
  • 2020-07-01 Paul to create a ticket about solutions to reduce incoming comms. [Topic: Revision of acceptable use policy to reduce incoming comms]
  • 2020-07-01 Grant to work out some of the questions for an online form as a solution to reduce incoming comms. [Topic: Revision of acceptable use policy to reduce incoming comms] #2020-08-12 need to think about the reply.
  • 2020-07-01 Michal to reach out to AWS (need a story for AWS to show how their help will lead to AWS spending from users). [Topic: Commercial CDN for Bulk Tile Users] https://lists.openstreetmap.org/pipermail/talk/2020-May/084700.html #2020-08-12 Michal feels blocked, could draft something. We got contacted by AWS, not replied yet. More info at 2020-08-12 reportage.
  • 2020-06-04 Paul to update the Github ticket "Adding API key support for tile.osm.org" https://github.com/openstreetmap/operations/issues/342
  • 2020-06-04 OPS team to draft an email (regarding a call for proposals), ask for comments. [Topic: Adding API key support for tile.osm.org https://github.com/openstreetmap/operations/issues/342]
  • 2020-04-10 OWG Push up tile usage policy (commercial entities, vehicle tracking applications - which are heavy on Nominatim and probably not attributing as well). [Topic: Commercial CDN for Bulk Tile Users]
  • 2020-04-10 Grant and Tom to work out a table of different data bits, work out how they are backed up and what can be potentially improved. [Topic: High Availability / Redundancy of OpenStreetMap.org (and primary services)]
  • 2020-04-10 [Not assigned] Potentially move some more of backup data into long term S3 buckets. [Topic: High Availability / Redundancy of OpenStreetMap.org (and primary service)]

Action items from this meeting

  • Michal to email Fastly. [Reportage: Check with Fastly if they are happy to be credited as a top 3 donor].
  • Paul to write a recommendation that will be sent to the board. [Topic: Data Center move].
  • Grant to get contact info of ISP to Guillaume. [Topic: Improving networking in AMS].
  • Paul to check with serverdirect.nl and Nextron. [Topic: Shipping hardware].

Reportage

Cost of Amazon Web Services continuous integration runners:
~ USD 150/month for 100% concurrency
~ 1 USD/run (10 min, using Spot instances)
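
The two figures above can be compared with a rough break-even calculation. This is only a sketch: it assumes the monthly figure and the per-run figure are alternative ways of paying for the same CI workload, which the minutes do not state explicitly. The function names and the example run counts are hypothetical.

```python
# Approximate figures taken from the minutes; treating them as
# alternative pricing models is an assumption made for illustration.
ALWAYS_ON_USD_PER_MONTH = 150.0  # ~USD 150/month for 100% concurrency
SPOT_USD_PER_RUN = 1.0           # ~USD 1 per 10-minute run on Spot instances


def monthly_spot_cost(runs_per_month: int) -> float:
    """Estimated monthly cost when paying per run on Spot instances."""
    return runs_per_month * SPOT_USD_PER_RUN


def break_even_runs() -> float:
    """Runs per month at which per-run cost equals the flat monthly figure."""
    return ALWAYS_ON_USD_PER_MONTH / SPOT_USD_PER_RUN


def cheaper_option(runs_per_month: int) -> str:
    """Name the cheaper model for a given (hypothetical) number of runs."""
    if monthly_spot_cost(runs_per_month) < ALWAYS_ON_USD_PER_MONTH:
        return "per-run"
    return "always-on"


if __name__ == "__main__":
    print(f"Break-even: {break_even_runs():.0f} runs/month")
    for runs in (40, 200):  # hypothetical workloads
        print(runs, "runs/month ->", cheaper_option(runs))
```

Under these assumptions, paying per run is cheaper while the number of CI runs stays below roughly 150 per month.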

Discourse

  • There was an offer to find someone to help.
  • Pull Request welcome.
  • Not standard docker - they want people to use their wrapper script.
  • Foundational work might be needed (Chef support).

Chef

  • Install docker.
  • Install wrapper-script dependencies.
  • Manage the wrapper script and some of the config files it reads (e.g. it wants to do Let's Encrypt management).

Decision: revisit at the next meeting.

Docker on Errol issue

Randomly it doesn't want to talk to the network and has to be restarted.

> Restart docker-daemon.

Fastly

Action item: Michal to email Fastly.

IPv6 issues resolved?

Carried over from 2021-03-24 OPS meeting.

New breakage resolved.
Google-Cogent issue ongoing.

OAuth

Work by Tom - mostly done.

Data Center move

Paul talked with a number of providers.
One will provide a revised quote.

Paul and Grant recommend Equinix in Dublin:

  • Price: similar range to other options.
  • Technically: no better options.
  • Operationally: easier going through 1 system.
  • Network: Better connectivity and more options for peering.
  • Comfortable dealing with them.
  1. Some complaints about dealing with Equinix from our accountant.
  2. Treasurer: No strong feelings if Equinix is preferred.

Suggestion: Have a credit balance, so as to be 1-2 months ahead in billing, in case we encounter accounting problems.

Power
Suggestion for 3.
They can up it to 5, but there is no guarantee.

On the other primary: suggestion to be agnostic on whether Amsterdam or Dublin is primary.

Other options: Interxion

  • Similar connectivity to Equinix.
  • Major hubs in Dublin.
  • More expensive.
  • We would have to be dealing with 2 systems.

Decision: Equinix - once the last (revised) quote is received.

Action item: Paul to write a recommendation that will be sent to the board.

Michal had to leave.

Improving networking in AMS

Grant had past communication with another UK hosting provider who was willing to give a 10 Gbit feed with less current.

Model: SG550X-48

  • 2 ports shared.
  • Using 1 port for cross-connect.
  • VRRP

https://www.cisco.com/c/en/us/support/switches/sg550x-48-48-port-gigabit-stackable-managed-switch/model.html

Backup: Text-config.

New expertise required.
New hardware not necessarily needed.
Once set up, it's OK.

Action item: Grant to get contact information of ISP to Guillaume.

New rendering server

At capacity in Europe.

Suggestion: tile rendering in Dublin and AMS or Zagreb.

Decision: new server at AMS.

Shipping hardware

Don't do cross-border shipping. Too complicated at the moment due to Brexit.

Action item: Paul to check with serverdirect.nl and Nextron.

Next meeting

Wednesday 21 April 2021 18:00 London time