Transforming GeoSpatial Data: My Journey with the Overture Project in the MLH Fellowship

As part of the MLH Fellowship 24.FAL.A, I’ve embarked on a journey into the world of geospatial data, a field new to me but one I’ve grown passionate about. From early feelings of imposter syndrome to gaining confidence with every challenge, this experience has deepened my appreciation for technology and the power of persistence. Alongside talented teammate Petar Dosev and with guidance from Meta maintainers Christopher Beddow and Benjamin Clark, I’ve been developing a tool to convert GeoJSON data to OpenStreetMap (OSM) format—a project that’s about more than coding; it’s about enhancing access to valuable geospatial data for communities everywhere.

Understanding the Overture2OSM Concept

The Overture2OSM project builds on two major map datasets, each with its own strengths:

  • Overture Maps: Rich with user-generated data about local businesses, Overture includes information such as hours, customer reviews, and business categories—often missing in global datasets—courtesy of Facebook and Microsoft users.

  • OpenStreetMap (OSM): Known for its global reach and community-driven accuracy, OSM lacks some hyper-local details due to varying user contributions. The strength of OSM is its broad coverage, yet it benefits from enhancements to local data.

The diagrams below illustrate the contrast between these two datasets for Lagos, Nigeria:

Figure 1: Overture Map of Lagos, featuring user-contributed local business data.

Figure 2: OpenStreetMap for Lagos, renowned for comprehensive coverage but lacking some hyper-local business insights.

Our mission with Overture2OSM is to bridge this gap by creating a tool that ensures data can move seamlessly between these datasets, ultimately improving the accessibility and quality of OSM’s information.
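
To make the idea concrete, here is a minimal sketch of the core transformation, assuming Point geometries and flat feature properties only. The helper names escapeXml and pointsToOsmXml are hypothetical and are not the project's actual functions. A real converter would also map Overture property names to OSM tag keys, which this sketch skips.

```javascript
// Escape the five XML special characters so attribute values stay well-formed.
const escapeXml = (value) =>
  String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');

// Hypothetical helper: turns GeoJSON Point features into OSM XML nodes.
// Assumes properties are flat key/value pairs; other geometries are ignored.
function pointsToOsmXml(features) {
  const nodes = features
    .filter((feature) => feature.geometry && feature.geometry.type === 'Point')
    .map((feature, index) => {
      // GeoJSON stores [longitude, latitude]; OSM uses lat/lon attributes.
      const [lon, lat] = feature.geometry.coordinates;
      const tags = Object.entries(feature.properties || {})
        .map(([key, value]) => `    <tag k="${escapeXml(key)}" v="${escapeXml(value)}"/>`)
        .join('\n');
      // Negative ids mark objects that do not exist in the OSM database yet.
      const id = -(index + 1);
      return tags
        ? `  <node id="${id}" lat="${lat}" lon="${lon}">\n${tags}\n  </node>`
        : `  <node id="${id}" lat="${lat}" lon="${lon}"/>`;
    });

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<osm version="0.6" generator="overture2osm-sketch">',
    ...nodes,
    '</osm>',
  ].join('\n');
}
```

Passing in a single feature for a Lagos business yields one node element, with each property written as a tag child.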

My Contributions to the Project So Far

XML Processing and Validation

To support accurate data handling, I focused on XML processing and validation, a critical skill in geographic information systems (GIS). Using JavaScript's xml2js library, I implemented a parsing function that converts XML data into plain JavaScript objects. The excerpt below shows the command-line entry point that feeds this pipeline: it caps concurrency, collects the input file names, and picks an output mode:

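import minimist from 'minimist';
import pLimit from 'p-limit';

// Allow at most five conversions to run concurrently.
const limit = pLimit(5);
const argv = minimist(process.argv.slice(2));
const filenames = argv._; // positional arguments: the input files
const outputOption = argv.output || 'both';

// Print usage when help is requested or no input files are given.
if (filenames.length === 0 || argv.help) {
  console.log(`GeoJSON to OSM Data Converter\n\nUsage: o2o-cli <file.geojson> [...] --output <console|file|both|xml>`);
  // Exit with an error code unless the user explicitly asked for help.
  process.exit(argv.help ? 0 : 1);
}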

The --output flag lets users choose among the console, file, both, and xml modes, and it defaults to both when omitted.
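
One gap in a setup like this is that a mistyped --output value passes through unchecked. A minimal sketch of a guard follows, assuming the four modes listed in the usage string. The helper name resolveOutputMode is hypothetical and not part of the project.

```javascript
// The modes advertised in the CLI usage string.
const OUTPUT_MODES = ['console', 'file', 'both', 'xml'];

// Hypothetical helper: returns a valid mode or throws with a readable message.
// A bare `--output` flag without a value arrives from minimist as `true`,
// which is rejected here like any other unknown value.
function resolveOutputMode(value = 'both') {
  const mode = String(value).toLowerCase();
  if (!OUTPUT_MODES.includes(mode)) {
    throw new Error(
      `Unknown --output value "${value}". Expected one of: ${OUTPUT_MODES.join(', ')}`
    );
  }
  return mode;
}
```

Calling resolveOutputMode(argv.output) in place of the `||` fallback would keep the same default while failing fast on typos.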

Batch Processing and Conversion

Handling multiple files at once is key to making the tool user-friendly. I designed a batch processing function that accepts several GeoJSON files in one run, skips anything without a .geojson extension, and converts the rest concurrently. The full version also checks for large files and handles them separately; the simplified excerpt below shows the core loop:

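import { promises as fs } from 'fs';

// geojsonToOsmXml and saveToXmlFile are the project's own helpers (not shown);
// limit is the pLimit(5) instance created in the CLI setup.
async function processFilesWithBatching(filenames) {
  // Only .geojson files are converted; everything else is skipped.
  const validFiles = filenames.filter(filename => filename.endsWith('.geojson'));

  // Queue every valid file; limit() keeps at most five conversions in flight.
  // (Slicing to the first five files here would silently drop the rest.)
  await Promise.all(validFiles.map(filename =>
    limit(async () => {
      try {
        const content = await fs.readFile(filename, 'utf-8');
        const data = JSON.parse(content);
        const osmData = geojsonToOsmXml(data.features);
        await saveToXmlFile(filename, osmData);
      } catch (error) {
        // Log and continue so one bad file does not abort the whole batch.
        console.error(`Error processing ${filename}:`, error.message);
      }
    })
  ));
}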

Because limit caps the number of conversions in flight at five, a large batch finishes faster than a sequential loop would, without opening every file at once.
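
For readers curious about what a limiter like this does internally, here is a minimal sketch of the idea. It is not the p-limit source, only an illustration of the queueing technique, and createLimiter is a hypothetical name.

```javascript
// Minimal concurrency limiter: at most `max` tasks run at once,
// and the remaining tasks wait in a first-in, first-out queue.
function createLimiter(max) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= max || queue.length === 0) return;
    active += 1;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task) // run the task; a thrown error becomes a rejection
      .then(resolve, reject) // forward the outcome to the caller's promise
      .finally(() => {
        active -= 1;
        next(); // a slot is free, so start the next queued task
      });
  };

  // The returned function mirrors how `limit(fn)` is used in the excerpt above.
  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}
```

Each call returns a promise for that task's result, so Promise.all over the wrapped calls still resolves in input order even though only a few tasks run at a time.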

API Development and File Handling

Using Express.js and Multer, I built a robust API for file handling and GeoJSON-to-OSM conversions. The API manages file uploads, validates data, and responds with output options based on the user’s preference. Below is a simplified example of the API’s setup:

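import express from 'express';
import multer from 'multer';
const app = express();
// Keep uploads in memory; this storage choice is an assumption for the excerpt.
const upload = multer({ storage: multer.memoryStorage() });
app.use(express.json());

app.post('/api/convert', upload.single('file'), (req, res) => {
  // Multer leaves req.file undefined when the request carries no file.
  if (!req.file) {
    return res.status(400).send('No file uploaded');
  }
  // Conversion logic here
  res.send('File converted successfully');
});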

The full endpoint runs the conversion on the uploaded file and returns a clear error when the upload is missing or invalid.
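
The validation step might look something like the sketch below, assuming the API expects a GeoJSON FeatureCollection. The function name parseFeatureCollection and the shape of its return value are hypothetical, chosen so a route handler can turn a failure directly into an HTTP response.

```javascript
// Hypothetical helper: parses uploaded text and checks that it looks like a
// GeoJSON FeatureCollection. Returns a result object instead of throwing,
// so the caller can map failures to a 400 response.
function parseFeatureCollection(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch (error) {
    return { ok: false, status: 400, error: 'Upload is not valid JSON' };
  }

  if (!data || data.type !== 'FeatureCollection' || !Array.isArray(data.features)) {
    return { ok: false, status: 400, error: 'Expected a GeoJSON FeatureCollection' };
  }

  return { ok: true, features: data.features };
}
```

Inside the route, a failed result could be sent back with res.status(result.status).send(result.error), while a successful one hands result.features to the converter.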

Data Normalization and Testing

Ensuring consistency in data is critical for OSM standards. I implemented functions for normalizing coordinates, street names, postal codes, and more. For example, this function normalizes street names:

const normalizeStreetName = (name) => {
  const abbreviations = { 'St.': 'Street', 'Ave.': 'Avenue' };
  return name.split(' ').map(part => abbreviations[part] || part).join(' ');
};

To verify functionality, I added unit tests using Mocha to ensure the function’s accuracy:

import assert from 'assert';

// describe and it are globals provided by the Mocha test runner.
describe('normalizeStreetName', () => {
  it('should normalize street abbreviations correctly', () => {
    assert.strictEqual(normalizeStreetName('St. John Ave.'), 'Street John Avenue');
  });
});

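Testing keeps the output consistent and gives the tool a solid foundation for long-term maintenance. It also surfaces a limitation worth addressing later: the lookup expands every "St." to "Street", even in names where it stands for "Saint".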
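
Coordinate normalization follows the same pattern. Below is a minimal sketch, assuming GeoJSON-style [longitude, latitude] input and rounding to seven decimal places, which is the precision OSM stores. The function name normalizeCoordinate is hypothetical and is not the project's actual implementation.

```javascript
// Hypothetical helper: validates a GeoJSON [lon, lat] pair and rounds it
// to seven decimal places, returning OSM-style { lat, lon }.
function normalizeCoordinate([lon, lat]) {
  if (!Number.isFinite(lon) || !Number.isFinite(lat)) {
    throw new TypeError('Coordinates must be finite numbers');
  }
  if (lat < -90 || lat > 90 || lon < -180 || lon > 180) {
    throw new RangeError(`Coordinate out of range: lat=${lat}, lon=${lon}`);
  }
  const round7 = (n) => Math.round(n * 1e7) / 1e7;
  return { lat: round7(lat), lon: round7(lon) };
}
```

Rejecting out-of-range values early also catches the common mistake of passing latitude and longitude in the wrong order, at least when the swapped latitude falls outside ±90.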

Enhancing Project Structure

Maintaining an organized codebase is essential for collaboration. I recently created a dedicated test folder, simplifying project management and encouraging contributions. This organizational improvement has already made it easier for team members to navigate and collaborate.

Looking Ahead

In the remaining weeks of the MLH program, we aim to add features for larger datasets and explore visualization options, helping users better understand data transformations. We’re also considering integrating additional data sources to enrich OSM with hyper-local business information and improve usability across diverse applications.

Acknowledgments

A huge thank you to my MLH teammates and maintainers from Meta! This journey has not only expanded my technical skills but also underscored the power of collaboration and community in tech.

Conclusion

Reflecting on my journey, I’m proud of my progress. I went from doubting my place in the field to contributing meaningfully to a project with real-world impact. It’s been an exciting, challenging experience—and a reminder to keep embracing new challenges with curiosity and determination.