Operations/Minutes/2024-01-11

From OpenStreetMap Foundation

OpenStreetMap Foundation, Operations Meeting - Draft minutes

These minutes do not go through a formal acceptance process.
This is not strictly an Operations Working Group (OWG) meeting.

Thursday 11 January, 19:00 London time
Location: Video room at https://osmvideo.cloud68.co

Participants

Minutes by Dorothea Kazazi


New action items from this meeting

  • Grant Slater to create a test parallel version of tile CDN. [Topic: Serving different images instead of errors to tile requests]
  • Paul Norman to find the image he had created. [Topic: Serving different images instead of errors to tile requests]
  • OPS to move to Fastly config as code. [Topic: Serving different images instead of errors to tile requests]

Reportage

Revisiting the "policy for purchasing" document

Action item: 2023-11-30 Grant to revisit the "policy for purchasing" document, which currently is focused on specs, and add information such as the process for obtaining approval for purchases. [Reportage]

  • Pending.
  • To cover: who approves purchases, the steps involved, etc.

Facebook logins

Action item: 2023-11-30 Tom to check whether OSMers with Facebook login got null passwords or random strings that don't function as passwords. [Topic: Facebook logins]

Logins

Action item: 2023-11-30 Paul Norman to open an OPS ticket on the logins which have passwords stored as md5 or salted md5. [Topic: Logins which have passwords stored as md5 or salted md5]

Passwords

Action item: 2023-11-30 OPS to find out how many users have null password or old passwords or random strings as "virtual" salts. [Topic: Logins which have passwords stored as md5 or salted md5]

Add corporate members to the list of supporters

Action item: 2023-11-30 Paul to post on the PR https://github.com/openstreetmap/openstreetmap-website/pull/4379 and do a revision of the operations' policy [Topic: Hosting provider credit policy]

  • Completed, and a circular has been sent.
  • OPS to reply to the circular.

OPNVKarte

Action item: 2023-11-02 Paul to update the PR https://github.com/openstreetmap/openstreetmap-website/pull/4126 [Topic: OPNVKarte]

Policy for addition of OSM editors to the osm.org menu

Action item: 2023-06-29 Grant to put Martijn's policy for addition of OSM editors to the osm.org menu out for feedback. [Topic: Draft policy by Martijn] - https://community.openstreetmap.org/t/proposal-osm-org-editor-inclusion-policy/106589

  • Done.
  • Someone had summarised the thread.

Merging Pull Requests over objections of others

Running low-zoom tiles daily https://github.com/openstreetmap/chef/pull/627

Discussion on asking other team members for advice on a pull request and then proceeding with it, even though one team member had raised issues with it and had later said over IRC that the PR should not go ahead.

Issue: A sense that opinions are undervalued and that there is no point in commenting, and a need for evidence-based discussion in decision-making.

Incentive for PR: Grant wanted to get the data to solve a problem that mappers were having.

Notes on PR

  • Grant asked for advice.
  • Paul mentioned that machines take 4 hours to do a re-render and that it shouldn't run daily.
  • Grant merged the PR and reverted it, as it caused problems that Tom had predicted.
  • Then Tom merged it again, after the memory issues were fixed.
  • Tom moved the rendering to 23:00 local time.

Specific concerns

  • Absence of evidence from the ticket.
  • Servers spending 20-25% of their time daily on low zoom renders.
    • They would be idle anyway.

Points mentioned during discussion

  • Grant had tried to work out whether a four-hour period would be a problem.
  • Suggestion: To see how long the re-render would take, Grant could have run it on one of the servers, as Tom did.
  • Grant was happy to revert the PR as soon as there was an issue, which he did.
  • Grant changed the swap space on all the machines and then reverted that change, as it was worse.
  • Tom took one of the Australian servers out of use, as the other one was capable of handling the load, and used it to run tests.
  • The low-traffic ("downtime") period is a window of at least six hours; our peak lasts 16 to 18 hours a day.
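
The figures above can be checked with a short calculation. This is a minimal sketch and not part of the discussion: it assumes the "downtime period" means the off-peak window, and that the render should fit entirely inside it. A 4-hour render is about 17% of a day, while the 20-25% raised under specific concerns would correspond to roughly 4.8 to 6 hours.

```python
# Figures from the minutes; treating the "downtime period" as the off-peak
# window is an assumption made for this sketch.
HOURS_PER_DAY = 24
RENDER_HOURS = 4          # re-render duration mentioned by Paul
PEAK_HOURS = (16, 18)     # length of the daily peak, low and high estimate


def daily_share(hours: float) -> float:
    """Fraction of a 24-hour day taken up by a task of the given length."""
    return hours / HOURS_PER_DAY


def off_peak_hours(peak_hours: float) -> float:
    """Hours left in the day outside the peak period."""
    return HOURS_PER_DAY - peak_hours


render_share = daily_share(RENDER_HOURS)
windows = sorted(off_peak_hours(p) for p in PEAK_HOURS)   # shortest first
render_fits = RENDER_HOURS <= windows[0]

print(f"Render share of the day: {render_share:.1%}")
print(f"Off-peak window: {windows[0]} to {windows[-1]} hours")
print(f"Render fits in the shortest window: {render_fits}")
```

Under these assumptions the render fits inside even the shortest off-peak window, with about two hours to spare.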

Other points mentioned

  • Running an experiment is ok.
  • There is actual evidence now, which was lacking at the time.

Thoughts by Grant on what he could have done

  • Could have included some of his evidence on why 23:00 was a suitable time to run daily.
  • Could have addressed specifically the concern about the four hours and the load.

On waiting time

  • Grant viewed this as an experiment and didn't want to wait for a week.
  • OPS had waited 1 month until the ticket got opened and merged.
    • There were additional changes to threading, as it was still hitting memory limits.

Suggestion: Communicate more when OPS members believe something is of importance, and back it up with data.


2024 budget

Latest version: dated the 19th

  • The budget was sent on 2023-12-24.
  • Harrison Devine (fundraising advisor team) did not get in touch with OPS.
  • Parts that still need some work concern risk assessment and some explanatory text.
    • A request for more detailed cash flow analysis was determined to be the responsibility of accounting.

General suggestions

  • Cash flow analysis.
  • OPS to focus on hardware and others to determine if money is available.
  • Should be more ambitious concerning infrastructure purchases in the coming year.

Suggestions for budget additions

  • Additional SSDs or increasing storage on more servers than just four.
    • Storage is easy to predict on an annual basis.
    • Possibility of the OSM US Slack having to be replaced by the Discourse's chat feature, which may increase infrastructure demands.
      • There is a new machine we could move Discourse to.
  • Volunteer engagement fund, e.g. to pay for beverages at side-meetings happening at conferences, estimated at EUR 1,000.
  • Trainings.
  • Fund for in-person meetings at conferences where OPS members would go anyway, e.g. for booking a room.

Other points mentioned

  • Additional budget requests can be made throughout the year if needed, therefore there is no need to exhaustively plan for all eventualities.

To be added to the budget

  • EUR 1,000 - volunteer engagement.
  • EUR 1,200 - allowing an extra day for in-person meetings at conferences where OPS members would go anyway (travel).

Infrastructure

Suggestion: Retire Kessie (HP ProLiant SE326M1R2), currently used for imagery and hosted by Exonetric.


Serving different images instead of errors to tile requests

Guillaume Rischard proposed serving different images instead of errors to tile requests, as the French OSM community does, with good feedback.

  • Advantages: Informs non-technical people of what the issue is. Much more likely to get a positive action to fix the missing attribution, as the image is saying exactly why they're being blocked and all their users are made aware.
  • Risk: If the response headers are not right, this might become a caching problem, polluting the Fastly and browser caches.
  • Objection to: Sending a 200 response when it should be an error response.

Other points mentioned during discussion

  • We responded with an error code and an image around 10 years ago, until it became difficult to do so.
  • If there is just an error code, nothing is visible to the end user - they see a blank screen.
  • Paul had tried it once.
  • We would be creating a feedback loop to people using our tiles.
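
The idea of combining an error code with an explanatory image, while avoiding the caching risk, can be sketched as follows. This is an illustration only and not the Fastly configuration: the function name, the choice of status 403 and the use of `Cache-Control: no-store` are assumptions, and whether a client displays the body of an error response depends on the client.

```python
def blocked_tile_response(image: bytes) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, body) for a blocked tile request.

    Hypothetical helper: the status stays an error code, the body is an
    image explaining why the request was blocked.
    """
    headers = {
        "Content-Type": "image/png",
        "Content-Length": str(len(image)),
        # Assumption: keep the explanatory image out of the CDN and
        # browser caches so it cannot replace real tiles later on.
        "Cache-Control": "no-store",
    }
    return 403, headers, image


# Stand-in bytes (PNG signature only); a real deployment would load the
# actual explanatory image here.
image = b"\x89PNG\r\n\x1a\n"
status, headers, body = blocked_tile_response(image)
print(status, headers["Content-Type"], headers["Cache-Control"])
```

Keeping an error status addresses the objection to sending a 200 response, while the image body gives end users something visible.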

Fastly

  • Fastly has increased the number of backends we can have.
  • It has given us rate limiting capability.

Next step: Paul to search for the image he had created in the past.

Suggestion

  • Create a parallel test version of the tile CDN so that we can run some tests.
    • Some parts of the config may not work, e.g. the data objects.
    • Some adjustments might be necessary.
    • We have to request increased backends and rate limiting.

Action items

  • Grant Slater to create a test parallel version of tile CDN.
  • Paul Norman to find the image he had created.
  • OPS to move to Fastly config as code.

Spam reports

  • Deferred until May.

Facebook

  • They approved the request and the Facebook login is available once more.
  • The early rejections did not give rationale.
  • Later ones were about address details not matching.

OAuth 1.0a schedule

Usage will probably not decline any further. Tickets have been opened for big users.

Suggestions

  • Check with mmd about 2024-06-01 (Saturday) as the date to deprecate basic auth and OAuth 1.0.
  • Performing brownouts closer to that date would be a good idea (April to May 1).
  • Make a formal announcement before the next OPS meeting, if mmd agrees.
  • Disable registration of new OAuth 1.0 applications by 2024-03-01.
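
The suggested schedule could be expressed as a small date check. This is a sketch under stated assumptions: the two dates come from the suggestions above and still depend on mmd's agreement, while the brownout windows, the use of UTC and the function names are hypothetical and were not discussed.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Dates suggested in the minutes, pending agreement with mmd.
REGISTRATION_CUTOFF = datetime(2024, 3, 1, tzinfo=UTC)
SHUTDOWN = datetime(2024, 6, 1, tzinfo=UTC)

# Hypothetical brownout windows (start, end); the minutes only suggest
# holding brownouts between April and May 1.
BROWNOUTS = [
    (datetime(2024, 4, 10, 9, tzinfo=UTC), datetime(2024, 4, 10, 10, tzinfo=UTC)),
    (datetime(2024, 4, 24, 9, tzinfo=UTC), datetime(2024, 4, 24, 13, tzinfo=UTC)),
]


def registration_allowed(now: datetime) -> bool:
    """New OAuth 1.0 registrations are accepted only before the cut-off."""
    return now < REGISTRATION_CUTOFF


def oauth1_allowed(now: datetime) -> bool:
    """OAuth 1.0 requests work until shutdown, except during brownouts."""
    if now >= SHUTDOWN:
        return False
    return not any(start <= now < end for start, end in BROWNOUTS)


print(oauth1_allowed(datetime(2024, 4, 10, 9, 30, tzinfo=UTC)))   # inside a brownout
print(oauth1_allowed(datetime(2024, 5, 15, tzinfo=UTC)))          # before shutdown
```

Brownouts that get longer as the shutdown date approaches give remaining users a warning before the permanent switch-off.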

OSMCal uses OAuth 1.0

Next step: Paul to ping mmd on the ticket.


QGIS tile usage

  • QGIS has a built-in scraper.
  • No way of identifying anything that comes from QGIS.
  • We need to produce data, examples, and metrics on what percentage of traffic we serve is QGIS - we can provide that to the QGIS developer team.
  • A relevant ticket has been created https://github.com/openstreetmap/operations/issues/1019
  • On traffic: tile traffic served to QGIS is greater than that served to osm.org.
  • Paul mentioned that there are graphs available for use.

Action items reviewed at the beginning of the meeting