Operations/Minutes/2024-01-25

From OpenStreetMap Foundation

OpenStreetMap Foundation, Operations Meeting - Draft minutes

These minutes do not go through a formal acceptance process.
This is not strictly an Operations Working Group (OWG) meeting.

Thursday 25 January, 19:00 London time
Location: Video room at https://osmvideo.cloud68.co

Participants

Minutes by Dorothea Kazazi


New action items from this meeting

  • Grant Slater to set up PagerDuty and provide instructions to OPS on how to trigger PagerDuty manually if we notice the website is down (there is a ticket). [Topic: When to page someone for a problem]
  • OPS to publish the schedule of OAuth 1.0a/basic Auth deprecation. [Topic: OAuth 1.0a]
  • Grant Slater to contact the CWG about a blogpost on OAuth 1.0a deprecation. [Topic: OAuth 1.0a]
  • Grant Slater to publish the schedule for OAuth 1.0a deprecation on mailing lists/Discourse/Mastodon/Twitter. [Topic: OAuth 1.0a]
  • Tom Hughes to investigate what happens to existing OAuth 1.0a tokens when OAuth 1.0a gets turned off, and whether error messages can be sent back in the HTML of the response to basic/OAuth 1.0a users. [Topic: OAuth 1.0a]

Reportage

Create a test parallel version of tile CDN

[2024-01-11 Topic: Serving different images instead of errors to tile requests]

Need DNS and config.

Find the image that Paul had created for serving instead of errors to tile requests

[2024-01-11 Topic: Serving different images instead of errors to tile requests]

  • Image found - it is in base64.
  • No suitable repository to put them in yet.
  • Will wait for openTofu/Fastly set-up.

Post on the PR https://github.com/openstreetmap/openstreetmap-website/pull/4379 and do a revision of the operations' policy

[2023-11-30 Topic: Hosting provider credit policy]

Done.

Update the PR https://github.com/openstreetmap/openstreetmap-website/pull/4126

[2023-11-02 Topic: OPNVKarte]

Done.


OWG 2024 budget

  • Roadmap missing.
  • OPS are not ambitious with the budget.
  • Grant had a call with OpenAerialMap, who are starting to run into budget issues and are looking for storage options.
  • The imagery service that Grant stated will likely be used in 2024 more than what was intended.
  • Desire by board members for OSMF to host more open datasets.

Assumption

  • Expenses on subscriptions (Internet/Rack hosting) are similar to last year.

On the format of the budget the OWG provided

  • This is what the board asked.
  • Fundraising team members are ok with it.

Budget includes

  • 6.2K Operational contingencies.
  • 11K Capital contingencies.

Suggestions on amounts

  • Increase the amount for upgrades. Current amount in the budget: 3,400.
  • Include something for imagery.
    • Suggestion withdrawn.
  • Break down the amounts for Equinix/Internet/Remote hands.
    • Do it next year.
    • Remote hands expenses last year: ~ 1,500.
    • Hosting: ~ 20K/site.

Suggestions general

  • If there are going to be substantial changes to the OPS budget, someone other than Paul has to prepare them, given the time he put into preparing the budget.
  • Join the Finance Committee meeting - Wednesday after 1500 NY time.
  • Highlight to the board that this year we're not spending much money on hardware, as we are drawing more value from current servers.
  • Equinix/Remote hands: Start to track how much we're spending.

On future hardware replacements

  • Most of the servers are the same age, so when we replace them, it's probably going to be over half of them.

Decision: Agreement to provide to the board the budget that Paul created, highlighting that this year we're not spending much money on hardware, as we are drawing more value from current servers.

Other points mentioned during discussion

  • No point in amortising/depreciating items into our budget, when the money has been spent.
  • If there are contingencies, the board will approve an extraordinary budget.
  • Paul had budgeted for 2 servers failing.
  • Capacity increases have to flow down from capacity planning. Don't upgrade for upgrades' sake.
  • Previous budgets not adequately prepared but ok, given the time.
  • The idea is to raise a similar amount of money every year.

OAuth 1.0a

Schedule discussed in previous meeting

  • March 1st: stop new 1.0a registrations
  • May 1st: begin brownouts of 1.0a and HTTP basic.
  • June 1st: discontinue the 1.0a registration and basic authorisation.
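A brownout of the kind listed in the schedule above can be pictured as a time-based switch. The following is only a rough sketch, not how the website implements it: the year (2024), the ten-minute window at the start of each hour, and the function name are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Dates taken from the schedule above; the year is assumed to be 2024.
BROWNOUT_START = datetime(2024, 5, 1, tzinfo=timezone.utc)
SHUTOFF = datetime(2024, 6, 1, tzinfo=timezone.utc)

# Hypothetical brownout window: the first 10 minutes of every hour.
BROWNOUT_MINUTES = 10


def legacy_auth_allowed(now: datetime) -> bool:
    """Return True if OAuth 1.0a / HTTP basic requests would be accepted."""
    if now >= SHUTOFF:
        # After the final date, legacy authentication is always refused.
        return False
    if now >= BROWNOUT_START and now.minute < BROWNOUT_MINUTES:
        # During the brownout period, refuse requests inside the window.
        return False
    return True
```

Short, repeated refusals like this give client maintainers a chance to notice the change before the final switch-off.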

Suggestions related to error messages

  • Customise error message when OAuth token is used, to inform users, providing e.g. a wikilink.
    • We are returning "false" from various methods which go into the library. Library terminates URLs.
  • Set URL which sends back a static HTML file.
    • Does not help people who already have tokens - only people with new logins.
  • Minimum URL in the body, if possible, pointing to wiki or a GitHub ticket.
    • More likely for an API call with an existing token, than for an OAuth request to set a new token.
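The suggestion of a minimal URL in the response body could look roughly like the sketch below. It is illustrative only: the function name, the 401 status code, the plain-text content type and the example URL are assumptions, and whether the website's OAuth library allows a custom body at all was still to be investigated.

```python
def legacy_auth_rejection(info_url: str) -> tuple[int, dict[str, str], str]:
    """Build a (status, headers, body) triple for a refused legacy-auth request.

    The body is kept minimal: one sentence plus a link for whoever is
    inspecting the traffic after their users complain.
    """
    body = (
        "OAuth 1.0a and HTTP basic authentication are no longer supported. "
        f"See {info_url}\n"
    )
    headers = {"Content-Type": "text/plain; charset=utf-8"}
    # 401 is an assumed status code, not one agreed in the meeting.
    return 401, headers, body


# Hypothetical link target; the real wiki page or GitHub ticket is not specified.
status, headers, body = legacy_auth_rejection("https://example.org/oauth-deprecation")
```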

General suggestions

  • Check on Dev server what will turn on/off by disabling it.
    • It's not trivial.
  • Up to the maintainers to see what can be done.

Other points mentioned during discussion

  • Unclear when to stop using existing tokens.
  • Error messages would be for the persons looking at their traffic, when their users complain.
  • Brownouts: trivial
  • Up to maintainers to see what can be done.

Action items

  • OPS to publish the schedule of OAuth 1.0a/basic Auth deprecation.
  • Grant to contact the CWG about a blogpost on OAuth 1.0a deprecation.
  • Grant to publish the schedule for OAuth 1.0a deprecation on mailing lists/Discourse/Mastodon/Twitter.
  • Tom to investigate what happens to existing OAuth 1.0a tokens when OAuth 1.0a gets turned off, and whether error messages can be sent back in the HTML of the response to basic/OAuth 1.0a users.

Basic Auth deprecation

  • Website maintainers: less worried about basic Auth.
  • Few people use basic Auth.

Suggestions

  • Do brownouts.
  • Put a URL into the default 403 message.
    • Then you send it to everybody, which might be confusing.
    • It provides an answer.

Other points mentioned during discussion

  • It is hard to give any sort of error in this case.
  • When you turn off basic Auth, it i) stops sending the trigger response, and ii) ignores any authentication credentials supplied and treats the request as unauthorised, sending a 404.

Schedule for deprecation

  • 1 May
  • 1 June

When to page someone for a problem

Action item: Grant to set up PagerDuty and provide instructions to OPS on how to trigger PagerDuty manually if we notice the website is down (there is a ticket).

On PagerDuty

  • It supports multiple alerting options, has alert escalation, roster support, holiday calendars, can be plugged into anything and can be customised.
  • Can be set up, for example, to be triggered by inbound emails to a specific address.
  • We are not paying for it; it's sponsored via OSM US. We got 4 accounts, with full admin access.

Plan

Set up PagerDuty alerts to Grant when:

  • Planet is down
  • www.osm.org is down
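A manual trigger of the kind mentioned in the action item could be sent through the PagerDuty Events API v2. This is only a sketch under stated assumptions, not the instructions OPS will receive: the routing key is a placeholder for an integration key that does not exist yet, and the summary text, source and severity are made up for illustration.

```python
import json
import urllib.request

# Public endpoint of the PagerDuty Events API v2.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_trigger_event(routing_key: str, summary: str, source: str) -> dict:
    """Build the JSON body of a 'trigger' event."""
    return {
        "routing_key": routing_key,  # integration key of the PagerDuty service
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": "critical",  # assumed severity for an outage
        },
    }


def send_event(event: dict) -> int:
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Placeholder key and text; nothing is sent until send_event() is called.
event = build_trigger_event("YOUR_ROUTING_KEY", "www.osm.org is down", "manual report")
```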

Delays in minutes

  • There were complaints by community members about the time that it takes for OPS minutes to get published on the OSMF website.
  • Minutes for OPS-consumption are also available earlier on internal pads.

If OPS want to check things earlier, Dorothea offered to:

  • Prioritise minuting specific topics.
  • Provide the recording.
  • Provide the draft notes from the meeting - which would be incomplete and might include some inaccuracies.

Suggestion

  • Use hack.md and then Pandoc for conversion.
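The Pandoc step in this suggestion could be scripted along the following lines. This is a sketch only: it assumes the notes are exported as Markdown and that the target is MediaWiki markup, neither of which was specified in the meeting, and the file names are hypothetical.

```python
import subprocess


def pandoc_command(source_md: str, target: str, to_format: str = "mediawiki") -> list[str]:
    """Build the pandoc command line for converting Markdown notes."""
    return [
        "pandoc",
        "--from", "markdown",
        "--to", to_format,  # assumed target format
        "--output", target,
        source_md,
    ]


def convert(source_md: str, target: str) -> None:
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source_md, target), check=True)


# Hypothetical file names; requires pandoc to be installed when convert() is called.
command = pandoc_command("minutes.md", "minutes.wiki")
```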

The group was OK as long as the minutes get published on the OSMF website before the next OPS meeting. It was left to Dorothea to decide whether it would make sense to migrate from the current pads to hack.md. In case of migration, she will notify OPS about the new URLs.


OSQA

https://github.com/openstreetmap/operations/issues/149 (migrate OSQA)
https://github.com/openstreetmap/operations/issues/862 (containerise OSQA)

Agreement on the need to shut down help.openstreetmap.org. Need for a plan.

Suggestions

Related to Discourse

  • Import to Discourse: not possible with existing path.
  • There is a QA extension, which is not very good, but better than OSQA.

Plan

  • 1 March 2024: Disable logins/new questions on help.osm.org
  • Put a notice on help.osm.org with the shutdown date and encourage people to visit community.osm.org.
  • TBD: Static version of OSQA site.
  • Shut down.

On static archive

  • It will be publicly accessible.
  • Suggestion to remove it from Google's index after a period of time.

Other points mentioned during discussion

  • An older version of ASKbot has a converter from OSQA.
  • ASKbot was used by Fedora.
  • DB schema is different.
  • If someone wants to do the work, they can step forward.
    • Even in this case there will be work for OPS, to set up the test environment.
  • Somebody pointed to Apache ask.

Action items reviewed at the beginning of the meeting

  • 2024-01-11 Grant to create a test parallel version of tile CDN. [Topic: Serving different images instead of errors to tile requests]
  • 2024-01-11 Paul to find the image he had created. [Topic: Serving different images instead of errors to tile requests]
  • 2024-01-11 OPS to move to Fastly config as code. [Topic: Serving different images instead of errors to tile requests]
  • 2023-11-30 Grant to revisit the "policy for purchasing" document, which currently is focused on specs, and add information such as the process for obtaining approval for purchases. [Reportage] Added info: Who Approves / Steps etc
  • 2023-11-30 OPS to review the issue of spam reports to ISPs in 6 months (May 2024)
  • 2023-11-30 Paul to post on the PR https://github.com/openstreetmap/openstreetmap-website/pull/4379 and do a revision of the operations' policy [Topic: Hosting provider credit policy]
  • 2023-11-02 Grant to update the PR https://github.com/openstreetmap/openstreetmap-website/pull/4126 [Topic: OPNVKarte]
  • 2023-05-18 Paul to start an open document listing goals for longer-term planning. [Topic: Longer-term planning]