• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar

Publish What You FundPublish What You Fund

The Global Campaign for Aid and Development Transparency

  • RSS
  • Twitter
  • Vimeo
  • Youtube
  • LinkedIn
  • Facebook

NEWSLETTER

CONTACT

  • Why it matters
    • Why transparency matters
    • The Story of Aid Transparency
    • What you can do
    • Case studies
  • Aid Index
    • 2022 Index
    • Comparison Chart
    • Methodology
    • Index Archive
    • Tools
  • DFI Index
    • DFI Transparency Index 2023
    • DFI Research
    • DFI Transparency Tool
    • FAQs
    • Project Advisory Board
  • Our Work
    • Women’s Economic Empowerment
    • Localization
    • Gender Financing
    • Humanitarian Transparency
    • US Foreign Assistance
    • Data Use
    • IATI Decipher
    • Improving UK Aid Transparency
    • Webinars
    • Work Under Development
  • News
    • Reports
    • News
    • Events
    • Blog
  • About Us
    • Board
    • Team
    • Our transparency
    • Our Funders
    • Jobs
    • Annual Reports
    • Friends of…
    • FAQs
Show Search
Hide Search
Home / Blog / Locating the geodata – an IATI experiment
blog

Locating the geodata – an IATI experiment

By James Coe | Mar 16, 2017 | Blog

This blog post was written by James Coe of Publish What You Fund and Nick Hamlin from Global Giving as part of the Initiative for Open Ag Funding. It was initially published on InterAction’s website here. 

 

Using Project Documents to Simplify Location Data Publication

Location data matters. Despite being one of the least published pieces of data on the International Aid Transparency Initiative (IATI) Standard, it is consistently highlighted as one of the most important. Without information on where donors are spending their money, aid practitioners are less able to avoid project duplication or identify gaps in funding. As part of the recent Tool Accelerator Workshop, organized by the Initiative for Open Ag Funding, a team of staff from GlobalGiving, Development Gateway and Publish What You Fund worked together to see if we could enhance and simplify the publication of this vital, yet lacking, data.

The idea was simple: identify location names based on unstructured project documents and turn them into geocodes so that they could, for example, be plotted on a map. Like every data science project, our first step was to find some reliable input data upon which to build. We found a treasure trove of potential sources embedded in the document links included in IATI activity files. Within these, evaluation documents seemed to be the best potential source of location data, so we created a library of these files to quickly sample from.

Data extraction

Documents in hand, we now had to try to extract the relevant location names and other useful information. Unfortunately, the types of documents that organisations attach to IATI activities are extremely diverse and come with a wide variety of structures and file formats. To begin to make sense of them, we first converted them into raw, uniform text. We then used a technique called named entity recognition to distill the messy raw text into the clean lists of organisations and locations hidden inside.

Thanks to Python’s natural language processing tools, we were able to begin iterating quickly on our corpus of documents and quickly discovered what data points could be easily extracted. The excerpts of our code on Github describe this process in more detail. In the end, our result looked something like this (emphasis added):

Location Data Points

We now had a list of sub-national locations; some relevant, some not. Our next step was to convert these names into geocodes. Much of our exploration at this stage focused on whether these results could be integrated with Development Gateway’s Open Aid Data Geocoder, which can assign precise geospatial data based on the location names in accordance with the IATI schema.

Several practical hurdles remain before an approach like this could be used in production, but by the end of this short exercise we had discovered a simple way of automating IATI data creation by scraping text from project documents, extracting location data and then converting these results into geocodes.

What next?

If we are to take this idea further, our next step would be to identify the earliest instance in which subnational locations appear in available project documentation. Since evaluation documents are only available at the end of a project’s lifecycle, they are not an ideal choice for completing forward-looking IATI data, although they could be used to enhance data already on the IATI registry. We will also need to test ways to implement this approach at scale before integrating it into donor publication processes.

In any case, what we have learned so far is that location data is central to improved coordination and, with this automated extraction and geolocation process, it is well worth the effort to explore how we can improve its publication and availability.]

Primary Sidebar

NEWS Topics

Africa Agriculture Aid transparency Aid Transparency Index Australia Budget ID Canada China Climate Change Data Revolution Data use Data Visualisation Development Finance institutions DFI Spotlight DFI Transparency Tool European Commission Financing for Development France Freedom of Information Gender Germany GPEDC Humanitarian Impact International Aid Transparency Initiative Japan Joined-up data Kenya Letters MDGs Newsletter OECD Open data Open government Press Releases Publish What You Fund Road to 2015 Sustainable Development Goals Sweden UK United Nations US Webinar Women's Economic Empowerment World Bank

Twitter

  • Starting now! Is USAID’s localization mission falling short? Join @sallyppaxton in conversation with @Devex… https://t.co/riMxuVGi4Y
    Mar 29, 2023
  • Join us in one hour when @sallyppaxton will be discussing “Is USAID’s localization mission falling short?” with… https://t.co/GLP8VEsd4z
    Mar 29, 2023
  • This @icai_uk report also calls for the Home Office to be more deliberate, urgent and transparent in how it address… https://t.co/3RNzVeeq5w
    Mar 29, 2023
FOLLOW US
  • Contact Us
  • Copyright
  • Privacy Policy
  • RSS
  • Twitter
  • Vimeo
  • Youtube
  • LinkedIn
  • Facebook

Publish What You Fund. China Works, 100 Black Prince Road, London, SE1 7SJ
UK Company Registration Number 07676886 (England and Wales); Registered Charity Number 1158362 (England and Wales)