Cluster A

From OpenStreetMap Foundation


Strategy Cluster A: Technical Development and Resilience

Strategy A01: Prepare Platform Services Demand Growth Forecast 2023–2028

The operation of the core infrastructure is at the heart of OSMF's mission. The OSMF needs to guarantee the long-term financial stability of operations. To make planning more reliable, the financial planning horizon for recurring operational costs should be extended to 5 years.


Task A103: Financial Planning for 5 year horizon

  • Description: OWG already plans hardware and operations with a 5-year horizon. Budgets, however, are only planned for a 1-year timeline. Financial planning for the 5-year horizon will help with overall revenue planning for the OSMF and ensure that enough funding is available for operations.
  • Action: Extend the current 1-year budget so that it also contains a 5-year projection for the financial planning of operations.
  • Responsible: X, working with OWG
  • Deliverable: Document showing the 5-year financial forecast for platform services
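
The arithmetic behind such a projection can be sketched as follows. This is only a minimal illustration that compounds a recurring annual cost over the planning horizon. The function name, the base cost and the growth rate are hypothetical placeholders, not OSMF figures.

```python
def project_costs(base_cost: float, annual_growth: float, years: int = 5) -> list[float]:
    """Return the projected recurring cost for each year of the horizon.

    Year 1 is the base cost; each later year grows by `annual_growth`
    (for example 0.10 for 10 %). All inputs are illustrative assumptions.
    """
    return [round(base_cost * (1 + annual_growth) ** year, 2) for year in range(years)]


# Placeholder numbers only: 100 000 per year, growing 10 % annually.
forecast = project_costs(100_000, 0.10)
for year, cost in enumerate(forecast, start=1):
    print(f"Year {year}: {cost:,.2f}")
```

A real forecast would likely track hardware replacement, hosting and staff costs as separate lines, each with its own growth assumption.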

Task A104: Financial risk assessment for sponsored hardware and services

  • Description: Part of the server infrastructure depends on sponsored hardware and services. Losing sponsorship can put a significant financial burden on the OSMF.
  • Action: Assess sponsored hardware within the 5-year planning horizon and develop replacement strategies for loss of sponsorship.
  • Deliverable: Inventory of sponsored hardware and services together with the cost of losing sponsorship.

Strategy A02: Technical strategy for Hardware

The OSMF operates its own hardware infrastructure for core services. The hardware is operated in two core data centers, which provide the necessary redundancy. They are complemented by sponsored machines and hosting sites when necessary, in particular to reduce latency for mappers. Operations rely largely on volunteers to plan and administer the infrastructure. The volunteers are supported by paid staff.

Task A201: Review of Hosting Policy

  • Description: The OSMF not only hosts the core infrastructure around database and website but also a number of secondary services to support the website and tertiary services to support the community. Running the secondary and tertiary services incurs both capital and operating costs. OWG operates a Service Classification Policy, which needs to be regularly reviewed and updated.
  • Action: Set up regular reviews of the Service Classification Policy.
  • Deliverable: Annual updates of the policy document.

Task A202: Additional sysadmin

  • Description: The system administration team is mostly based in the UK at the moment. This occasionally leads to long response times when failures happen during the European night. The team of system administrators needs to be expanded to cover a wider range of time zones.
  • Note: A full 24/7 standby would require at least 7 full-time system administrators. This is not the goal at the moment.
  • Action: Hire one system administrator in a different time zone to increase availability.

Task A203: Consolidate Hardware

  • Description: OSM operations have in the past relied heavily on sponsored hardware and hosting. The result was a very diverse set of servers, which is difficult to handle and maintain.
  • Action: Consolidate the hardware to use a smaller number of servers with similar specifications.
  • Project Status: Already on-going for purchases of new servers but there is still a backlog of old ones.

Task A204: Containerization of Services

  • Description: Task A203 will result in OSMF mainly operating a small number of large servers, while it still operates a large number of smaller services. These services will be containerized so that many services can run on the same machine without interference. Containerization will also make it easier to test deployments and may be a way to introduce more volunteer sysadmins.
  • Action: Consolidate existing services and containerize those that are small.
  • Project Status: Has just been started by OWG.

Task A205: Develop On-boarding Strategy for new Sysadmins

  • Description: Introducing volunteer sysadmins for the core infrastructure is difficult. There is a very steep learning curve, even for experienced sysadmins, to understand the specifics of the setup. There are no clear simple tasks to get started with. There is very little guidance for newcomers. For these reasons there haven't been any additions to the sysadmin team for years now.
  • Action: Identify areas of system administration which are suitable for beginners. Develop on-boarding documentation.

Strategy A03: Technical strategy for Core Software

The OSM Database, API and website are the core services that the OSMF operates for OpenStreetMap. The software for these services is purpose-made for the OpenStreetMap project and needs to serve the goals of the community and nothing else. Next to the database, there are also a number of services that are vital to the community. These are, for example, the community forum, which is important for communication, or map tiles and search, which provide immediate feedback about the mapping process. Such secondary services are hosted by the OSMF when needed. The software stack should incorporate existing open source software as far as possible.

Task A302: Evolve the OSM Data Model and API

  • Description: The OSM data model and API have only seen minor improvements in recent years. Because of the growing popularity of OSM, there are now many users affected by changes. It is no longer possible to fix perceived issues by simple changes to a single piece of software. Last year's study on the data model lists some of the larger issues with the data model and proposes solutions. We need to reach a community consensus on how to act on those proposals.
  • Action: Identify the changes that the OSM community wants to make and develop a project plan for implementing them that involves all affected parties.
  • Prerequisite: greatly simplified by A307 (developing a strategy for breaking API changes)

Task A303: Support for Website maintainers

  • Description: The openstreetmap-website project has not received enough maintenance in the last year. As a result, there is a huge backlog of unresolved issues and open PRs. The code has amassed significant technical debt, which hinders implementation of new features. The current maintainers do not have the time and resources to improve the situation and need support. The support should specifically cover maintenance tasks such as ticket triage, code review and code cleanup (as opposed to new feature development).
  • Action: Prepare a job description (JD). Review available resources for voluntary workers. If suitable voluntary workers are not available, apply to the Board for a contractor position. If granted, appoint a contractor.
  • Responsible:

Task A305: European General Data Protection Regulation (GDPR)

  • Description: The General Data Protection Regulation is a Regulation in EU law on data protection and privacy in the EU and the European Economic Area. The GDPR is an important component of EU privacy law and of human rights law, in particular Article 8 of the Charter of Fundamental Rights of the European Union. The OSMF wishes to modify much of the software in use to reduce personal information contained in OSMF websites. The changes required are documented but now need to be coded. There have been problems recruiting a Ruby developer from the volunteer world to undertake this work. See https://gitlab.com/osmfoundation/board/-/issues/131
  • Action: Split required actions into subtasks for website and operations. Work with OWG and website maintainers to develop an implementation plan. Procure resources to execute the plan.
  • Deliverable: No data

Task A306: Develop and set up a vector tile solution for the website

  • Description: The raster tile view on the main website has some severe limitations when it comes to helping the mapping process. There is only one style, so much of the wealth of the database is hidden from mappers and users alike. The map shows labels in only one language, which means the map is often unreadable outside the mapper's local language area. Vector tiles can improve the situation. There are some open source solutions available for a vector tile stack, which have been evaluated in recent years for use within the OSMF infrastructure. None of the solutions was satisfactory, in particular when it comes to the mandatory requirement that changes in OSM are almost instantaneously visible on the map. Furthermore, there is no vector tile style that reproduces the wealth of the current carto raster style.
  • Action: Develop and deploy a vector tile stack for use with the OSM main website. This includes the software to import and update data and create vector tiles as well as the initial base style.

Task A307: Develop a strategy for breaking API changes

  • Description: The OSM API has only seen non-breaking changes in recent years. The large number of clients and users has made it nearly impossible to follow the evolution strategy of the early OSM days and switch to a new API version in a concerted effort between all OSM software developers. We need to develop a strategy where the switch to a new API version can happen over a longer time.
  • Action: Investigate the implications of breaking API changes, develop a migration plan and make the necessary code changes.
  • Note: There is some preliminary work on this by Andy Allan, which can be used as a base for discussion.

Strategy A04: Supporting Software Development

There is a large ecosystem of software which is used in the mapping process: editors, QA tools, analysis tools and many others. In addition, there are tools and libraries which help with the development of this kind of software. The OSMF is committed to supporting free and open development through volunteers and to providing financial support where necessary. It does not get involved in strategic decisions or the design process of the software unless there is a danger to the goals of OSMF.

Task A403: Encourage and Collaborate on Standardisation in the mapping ecosystem

  • Description: Need more detail on what this is about
  • Action: Expand the OSM role in encouraging standardisation and collaboration in the mapping ecosystem

Task A404: Foster ecosystem of open source tooling and libraries

  • Description: Need more detail on what fostering an ecosystem requires.
  • Action: Foster ecosystem of open source tooling and libraries

Task A405: Increase developer contribution for OSM software

  • Description: Many software projects that are essential to the mapping process and data processing are maintained by only one or two people. Some are completely abandoned. This puts the operation of our core infrastructure at risk and makes OSM less attractive for data users. The OSMF should support expanding the developer community to ensure that OSM can rely on a healthy software ecosystem.
  • Action: Make an inventory of software that is strategically important for the OSMF core infrastructure and the mapping process, and assess the bus factor. Promote contribution to OSM core software. Set up a system to support maintainers with on-boarding new contributors. Investigate feasibility of an internship programme.
  • Deliverable: Improved bus factor for at least one project per year.
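
One way to assess the bus factor of a project is from its commit history. The sketch below uses a common heuristic, assumed here rather than taken from the plan: the smallest number of contributors who together authored more than half of all commits. The function name and the sample data are hypothetical.

```python
from collections import Counter


def bus_factor(commit_authors: list[str], threshold: float = 0.5) -> int:
    """Smallest number of contributors whose commits together exceed
    `threshold` of all commits. A heuristic, not an official definition."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    # Walk contributors from most to least active until the share is exceeded.
    for rank, (_, commits) in enumerate(counts.most_common(), start=1):
        covered += commits
        if covered > threshold * total:
            return rank
    return len(counts)


# Made-up example: one author wrote 8 of 10 commits.
authors = ["alice"] * 8 + ["bob"] + ["carol"]
print(bus_factor(authors))  # 1
```

Commit counts are a rough proxy. Review activity, release duties and operational knowledge also matter and are not captured by this measure.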

Task A407: Supporting other editors in agreement with maintainers

  • Description: While iD is the main editor available on the website, there are additional editors maintained by community members which cater to other needs, for example, JOSM for power users, Vespucci for mobile, StreetComplete for gamified mapping. The OSMF supports a rich editor ecosystem.
  • Action: Identify the most important editors and contact maintainers to find out if and what kind of support they need.

Task A408: Encourage closer collaboration between software maintainers and EWG

  • Description: The EWG is tasked with coordination of software development efforts across the OSM ecosystem. To that end it is important that the EWG members are in close contact with the maintainers of OSM software as well as the operations team. These groups need to become more aware that the EWG exists and can be asked for support.
  • Action: Set up regular meetings between the developer community and the EWG. Increase promotion for the EWG.

Task A409: Improve relations between software maintainers and community

  • Description: OSM has the problem that development of OSM core software is stagnating. From an outsider's perspective this is often perceived to be a result of gatekeeping by maintainers, while the actual reasons for the stagnation are more complex. The negative gatekeeping image has a self-reinforcing effect: it discourages contribution and thus reinforces the stagnation even more. One way to improve the image is to increase visibility of the state of the software and plans for future development and invite community feedback.
  • Action: Invite maintainers of core software to blog about their activities and plans on the official OSM blog. Set up community consultations with maintainers.
  • Deliverable: Regular public reports on the state of core software. Regular opportunities to meet the maintainers.

Strategy A5: IT security review and plan

Introduction:

The goal of the OSMF in offering a mapping platform is that the platform should be resilient and stable. Resilience is the ability to continue running after small or large errors in hardware and software. One example is the ability to survive the failure of a disk drive. The term "bounce back" describes resilience. Stability is the ability to offer an unchanged and uninterrupted set of services to users for long periods of time. Highly stable systems will run for hundreds of days without interruption of service. The OSMF infrastructure is classified into three tiers, which are defined in the Service Classification Policy. Depending on the service level, different strategies are applied for resilience and stability.

Task A501: Safety of Backups

  • Description: All data is backed up at least weekly. In addition, the replication log files for the main database are backed up off-site, so that full recovery at any point in time is possible. There is currently only a single backup location, in Amazon S3. To be resilient against a compromised backup, a redundant backup with different access credentials is needed.
  • Action: Set up backup replication to an external account.
  • Deliverable: Backups are stored in two independently accessible places.

Task A502: Backup Recovery tests

  • Description: While data from the OSMF infrastructure is safely backed up, the backups are not tested for integrity or for whether they can be used to successfully restore services.
  • Action: Set up a policy for regular trial restores of backups and implement the necessary infrastructure.
  • Deliverable: Backups are tested at least once a year.
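
The integrity part of a trial restore can be sketched as a checksum comparison. The example below assumes that a manifest of SHA-256 checksums was recorded when the backup was taken. The manifest format and the function names are hypothetical and do not describe the existing OSMF backup tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large backup files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(restored_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing from the restored
    directory or whose checksum differs from the manifest (assumed format:
    relative path -> hex SHA-256 digest)."""
    failures = []
    for rel_path, expected in manifest.items():
        candidate = restored_dir / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            failures.append(rel_path)
    return failures
```

An empty result only shows that the files survived intact. Whether a service can actually be brought up from the restored data still has to be tested separately.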

Task A503: Separation of privilege for Administrators

  • Description: The current administration setup via Chef does not allow restricted access for admins to specific services. Once an admin has access to Chef, they effectively have control over the entire system. We work around this by giving people access to specific machines and requiring pull requests to Chef. This does not scale and hinders on-boarding of new administrators.
  • Action: Develop a plan to give service-specific access to the administration setup.
  • Deliverable: Administrators for specific services can manage their infrastructure independently.

Task A504: Review access to services

  • Description: OSMF runs many services with logins restricted to people with specific roles (e.g. as part of a working group). There is currently no system in place to revoke privileges when people leave.
  • Action: Review services with special access permissions and define policies on who gets access under which circumstances. Develop a policy for revocation of access. Ensure that logins to these services are secure, for example, by enabling two-factor authentication.
  • Deliverable: Documented access privileges with up-to-date lists of who has access to what. Privileges are revoked when people leave.
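
A periodic access review could be as simple as comparing each service's access list against the current membership roster. The following is a minimal sketch. The data structures, service names and user names are made up for illustration.

```python
def stale_access(access_lists: dict[str, set[str]],
                 active_members: set[str]) -> dict[str, set[str]]:
    """Return, per service, the accounts that still have access although
    they are no longer on the roster of active members."""
    stale = {}
    for service, users in access_lists.items():
        leftover = users - active_members
        if leftover:
            stale[service] = leftover
    return stale


# Hypothetical data: bob has left but still has access to one service.
access_lists = {"wiki-admin": {"alice", "bob"}, "backup-bucket": {"alice"}}
active_members = {"alice"}
print(stale_access(access_lists, active_members))  # {'wiki-admin': {'bob'}}
```

The hard part in practice is keeping the roster and the access lists up to date, which is what the documented policy is meant to ensure.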

Task A505: Smart deployment

  • Description: Software in the OSMF infrastructure is regularly updated with security updates and new features. There is currently no infrastructure to test deployments before they are put into production. Failures in deployment therefore lead to system-wide outages.
  • Action: Introduce smart deployment through a blue-green or canary model.
  • Deliverable: New deployments are rolled out in stages.
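
The idea of a canary-style staged rollout can be sketched as follows: deploy to a small share of hosts first, check their health, and only then widen the rollout. The stage sizes and the `deploy` and `healthy` callbacks are assumptions for illustration. A real setup would hook into the existing configuration management and monitoring.

```python
import math
from typing import Callable, Sequence


def staged_rollout(hosts: Sequence[str],
                   deploy: Callable[[str], None],
                   healthy: Callable[[str], bool],
                   stages: Sequence[int] = (10, 50, 100)) -> bool:
    """Deploy to a growing percentage of hosts, stopping at the first
    stage where any already-updated host fails its health check."""
    done = 0
    for percent in stages:
        target = max(done, math.ceil(len(hosts) * percent / 100))
        for host in hosts[done:target]:
            deploy(host)
        done = target
        if not all(healthy(host) for host in hosts[:done]):
            return False  # halt here; the remaining hosts keep the old version
    return True
```

With ten hosts and the default stages, a failure on the first canary host stops the rollout after a single machine instead of causing a system-wide outage.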


Task A506: Test and documentation of Failover Procedures

  • Description: The main plan for disaster recovery in one data center is failover to the other data center. The failover is currently neither documented nor tested.
  • Action: Develop, test and document a plan for failover between data centers.
  • Deliverable: Documentation of failover plans

Task A507: Documentation of System Administration Policies and Procedures

  • Description: The system administration team has developed a number of strategies over the years for security, reliability and disaster recovery. Much of this is well known to the administrators but very little is written down outside of the management utility Chef. The result is a relatively low bus factor in system administration: the knowledge rests with very few people.
  • Action: Collect and document policies and procedures for system administration.
  • Deliverable: Documentation sufficient that an experienced external system administrator can recover the OSMF infrastructure.