Coupling AI and Sustainable Development Goals through Public Expenditure Data: Why Fiscal Transparency is Crucial to Achieve the 2030 Agenda

Since their establishment in 2015, the Sustainable Development Goals (SDGs) have become the leading international agenda to promote social, economic and environmental development. The 2030 Agenda has encouraged the construction of numerous development indicators through which governments can track and evaluate their progress towards the SDGs. For example, the United Nations provides open access to the official SDG database; the World Bank has collected enough indicators to build an ‘unofficial’ SDG database; and countries like Mexico are producing comprehensive indicators at the subnational level. Indicators, however, are only one side of the development equation: they constitute the output. Realistic strategies to reach a government’s specific goals cannot be properly designed without information about the input side: public expenditure.

To a reasonable degree, a government’s development strategy can be captured by its allocation of resources across different policy issues. The aim of such strategies is to transform the relevant indicators and, eventually, reach the government’s goals. Ironically, while we have plenty of development-indicator data, there is scant information on public expenditure. In recent years, the Global Initiative for Fiscal Transparency (GIFT) has been trying to address this shortfall. But even if fiscal data became available, the complexity of the policymaking process obscures our understanding of how public expenditure translates into development outcomes. This is because, for example, government agencies may have different goals from those of the central authority (misalignment of incentives), there may be inefficiencies due to corruption or lack of capacity and, on top of that, there are spillover effects between public policies (synergies and trade-offs). Fortunately, cutting-edge Artificial Intelligence (AI) research is trying to cope with this challenge. In this post, we argue that these technological advances, in conjunction with open fiscal data, can enable governments to fully exploit SDG indicators in the pursuit of worldwide development. For this reason, it is crucial that governments focus their efforts on two fundamental tasks: (1) producing granular open fiscal data and (2) linking it to development indicators.


AI for Sustainable Development

At The Alan Turing Institute, we are developing the computational framework of Policy Priority Inference (PPI). PPI builds on a behavioral model of the policymaking process. Among the most relevant aspects it considers are the learning process of public officials, coordination problems, incomplete information, and imperfect monitoring mechanisms. PPI simulates the complex and uncertain dynamics observed in development-indicator data. In parallel, it takes into account the network of interdependencies between development indicators, which describes positive and negative policy spillover effects. The method can be used with datasets that have more indicators than observations, such as those built by most governments. Likewise, it does not require an insurmountable collection of information, so it scales to both large and small datasets.

The innovation of using an AI tool such as PPI is that, by explicitly modeling the policymaking process that generates development indicators, we can simulate data that are not observable in the real world. One such example is precisely public expenditure on SDGs. In addition, PPI can estimate, at the level of each indicator, how efficiently resources are being used. In a recent paper, we use this technology to measure policy coherence, a topic that has not been properly quantified and yet is central to multilateral organizations such as the OECD. PPI reproduces empirical development indicators and, as a byproduct of its behavioral model, simulates the distribution of resources that gives rise to the observed indicator dynamics. We use these distributions to construct a policy coherence index that can be used to compare how consistent a government’s policy priorities are with its development goals. Our index can be extremely valuable for assessing how committed governments are to the 2030 Agenda, or any other, and for identifying the policy issues that should be prioritized.


AI + Open Fiscal Data

So, in the absence of open fiscal data, PPI simulates budgetary allocations that generate real-world development indicators. However, we can do much better if we feed this technology with public expenditure data. For example, if governments provide a detailed account of their allocations at the level of each development indicator, PPI can be tuned to match those expenditure patterns. This would represent a major improvement in the estimation of policy priorities. To get a clear picture of why this would be the case, let us provide some context on the tools that are currently being used to advise governments.

Traditionally, development consultants have attempted to approximate the effects of improving specific indicators by measuring their level of association. That is, by looking at metrics such as correlations or regression coefficients, analysts try to disentangle how, for example, improvements in the aquatic environment (SDG 14) relate to changes in GDP growth (SDG 8). These associations, while illustrative, are not informative about how to produce environmental improvements. In other words, by focusing solely on the output side of the problem (indicator data), we can only learn how an indicator in SDG 14 relates to another one in SDG 8, not how a specific budget program translates into effective policy contributions, then into development, and subsequently into spillovers. Ironically, the whole point of providing policy advice is to know the instruments that have dedicated resources, i.e. the input side. In principle, public expenditure data can ease this problem. This is why AI tools such as PPI are so important to produce reliable policy advice.

Going back to our main discussion, open fiscal data plays a crucial role in SDG policy prioritization through AI. This shows the importance of making fiscal data publicly available, an endeavor that GIFT has pursued forcefully in the last decade. There is, however, an additional challenge that needs to be tackled: linking expenditure to SDGs. As we have explained, the output side of the development process consists of indicators, not expenditure programs. This translates into a mismatch between inputs and outputs, something that requires the attention of every government. The Mexican government, for example, assembled a team of specialists from the Treasury and from the United Nations Development Program to produce the first fiscal-SDG linked data in the world. This experience should provide guidelines to other countries that are serious about fulfilling the 2030 Agenda.


The Future of Open Fiscal Data

A systematic effort to publish fiscal data and link it to development indicators is necessary if governments want to exploit AI for reliable policy advice. Nevertheless, there will still be other challenges ahead that need to be overcome. One of them is that, even if comprehensive fiscal-SDG linked data becomes available, we need to identify transformative expenditure. That is, just because we observe a substantial amount of resources in a specific policy issue, it does not mean that those resources correspond to transformative policies (those that improve indicators). The best example is the highway infrastructure of the industrialized nations. In these countries, extensive road networks have already been created, so most of the expenditure in this topic is dedicated to maintaining them. Thus, indicators such as road coverage will not change as a result of this expense. In contrast, a developing country where new highways are being constructed is effectively devoting resources to produce a transformation, pushing the relevant indicator upwards. In one of our papers, we have found a related, and paradoxical, example in the topic of education. While developing countries have not prioritized this issue (which they urgently need to do), the developed nations keep investing to transform it (e.g., Finland is currently transforming its curricula in profound ways).

In conclusion, if governments are serious about meeting the SDGs by 2030, or any other future development agenda, they need to embrace what AI methods have to offer and combine them with development-indicator data. However, these efforts will eventually reach a bottleneck if governments only care about the output side of development. Thus, it is crucial to start building data for the input side, in particular, fiscal-SDG linked data. Only then will governments be able to take advantage of the current technological revolution when designing and implementing policies.


Countries that are interested in applying this model, and that are committed to opening their public spending data and advancing the link between the budget and the SDGs, can contact the GIFT Coordination Team.



Omar A. Guerrero (@guerrero_oa) is a Senior Research Fellow at the Department of Economics at University College London and at The Alan Turing Institute, the UK’s national institute for data science and AI. He has been a fellow at the Oxford Martin School, the Saïd Business School and the Institute of New Economic Thinking at the University of Oxford. Currently, Omar works at the intersection of development economics and sustainability, trying to understand how policy priorities translate into effective development. For this, he employs techniques such as agent-computing models, machine learning and natural language processing.

Gonzalo Castañeda is Professor in Economics at the Center for Research and Teaching in Economics (CIDE), Mexico. He is also a member of the National Research System, level III (i.e., the top rank granted by the National Council of Science and Technology). Gonzalo works on Complex Adaptive Systems applied to issues related to economic development, and has a forthcoming book written in Spanish and English with the title “Social Complexity: An Innovative Approach for Understanding Socioeconomic Phenomena”.

Publishing Budget and Spending Open Data

    by Lorena Rivero del Paso (GIFT) and Oscar Montiel (Open Knowledge International)

Increasingly, we see examples where lack of transparency and accountability from governments affects trust. Being able to follow public money flows is an important step to recover trust and aim towards more effective governance of public funds. Despite this, according to the most recent edition of the Open Data Barometer, the number of national governments that publish their budget and spending reports and figures as data isn’t growing consistently.

Considering the lack of progress in such publication and the relevance of fiscal data, Open Knowledge International (OKI) and the Global Initiative for Fiscal Transparency (GIFT) have partnered to support governments in the publication of budget and spending open data through the Open Fiscal Data Package (OFDP). This specification allows publishers to structure their data in a way that makes its description and use as easy as possible, and the OpenSpending platform provides visualizations and developer tools for publishers and users.

As a follow-up to the blog post “Towards a schema for spending Open Data, Helpdesk included”, where you can read in more detail about the characteristics of the OFDP, this post will guide you through the steps for a successful publication.

First: what information is expected in a budget/spending file?

Any budget and spending open dataset should have four basic components: 1) the fiscal year presented, 2) budget classifications, 3) source of funding and 4) amounts for each stage of the transaction. Additionally, the dataset can be complemented with other relevant data and classifications included in the Financial Management Information Systems (FMIS).

1) Period

  • Fiscal Year- The fiscal year is the framework in which the approved budget is executed. While some countries have already developed the annual budget within a multiyear perspective, through the preparation of medium-term fiscal and budget frameworks, these frameworks are usually established at a higher level of disaggregation than the annual budget and expenditures.

The dataset can include several past and future years according to data availability. For this, each fiscal year can be a separate dataset.

2) Budget classifications

Regarding the second component, budget classifications, we refer to those indicated in the Fiscal Transparency Code of the International Monetary Fund. Furthermore, countries that have progressed in incorporating a program classification should include it as well. These are:

  • Administrative unit (government ministries/agencies and departments/divisions within agencies).
  • Economic type (“inputs” such as salaries, transfers, other non-salary current expenditures, capital spending) (following the international classification if available)
  • Functional and subfunctions (following the international classification if available)
  • Program/subprogram/activity/project; alternatively outcomes and/or outputs.

Some of these are cross-classifications, mainly in respect of the Program classification.

3) Source of funding

The source of funding serves to distinguish, in the financial statements, the origin of domestically financed expenditures (on-budget, extra-budgetary, counterpart fund) from expenditures financed by project aid.

4) Stages of the transaction

For the stage of the transaction, there is no single form of registration, but it is important to state clearly the type of data presented and which phase of the budget the user is looking at. Expenditure Control: Key Features, Stages, and Actors[1] identifies the following seven typical stages of the expenditure cycle, which should be considered for the dataset according to their availability in the country:

  • Authorization of expenditure- A fundamental principle of public finance is that expenditure and revenue proposals must be legally authorized to ensure accountability.
  • Apportionment of authorization for specific periods and spending units- The purpose of apportionment is to prevent spending agencies from incurring obligations at a rate which would require the authorization of additional funds for the fiscal year in progress.
  • Reservation- Once the apportionment of expenditure authorization is made and the spending authority has been released, some countries’ Public Financial Management (PFM) systems include a stage at which funds are reserved for a specifically known expense but for which no contract has yet been issued. At this stage, there is no legal commitment, but it is known that the expense will be incurred during the budget year and, therefore, the reserved funds should not be used for other activities.
  • Commitment- The commitment stage is the point at which a potential future obligation to pay is established. A commitment occurs when a formal action, such as placing an order or awarding a contract, is taken that renders the government liable to pay at some time in the future when the order or contract is honoured by its counterpart.
  • Verification (or certification)- After goods have been delivered and/or services have been rendered by a supplier, an authorized officer within the spending unit concerned verifies their conformity with the contract or order, and that liability and due date of payment are recognized.
  • Payment order- Once checks are made to ensure that all previously stipulated controls have been performed and documented, a payment order is issued.
  • Payment- Once a payment order has been issued, payments are made through various instruments including checks, electronic funds transfer (EFT), and sometimes cash, in favor of a supplier or other recipient to discharge the liability. In line with internationally accepted good practice, the payment should be made through a Treasury Single Account (TSA) system.

5) Additional data and classifications

  • Geographic classification-

A representation of which part of the country benefits from each of the government financial operations. This classification is difficult in most cases, so adaptations have taken the form of classifying by location of administrative units, taxpayers, recipients of government transfers, among others.

  • Investment projects-

Authorized public investment projects, including Public-Private Partnerships, if available. Data on these projects can be paired with geolocation by including their latitude and longitude. Furthermore, if more data is available in the Ministry of Finance systems, such as descriptions or links to cost-benefit analyses, among others, we can analyze it on a case-by-case basis.

  • Contracts-

Between 15 and 25 per cent of public expenditures are exercised through contracts. These data are useful for complete traceability of the related part of the transactions. The number and detailed data of the procurement process and/or the awarded contract can be included as part of the file. If the country has already implemented the Open Contracting Data Standard (OCDS), an additional column can be included with the Open Contracting ID (OCID), linking both data structures without the need to duplicate data. Information on contracts in a standardised form will also allow us to link the beneficiaries of these contracts and see where the money goes.
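
To make the OCID idea more concrete, here is a minimal Python sketch of how a spending file with an extra OCID column could be joined to contract records without copying contract details into the spending file. The column names, OCID values and supplier names are invented for illustration; they are assumptions, not part of the OFDP or of any real publication.

```python
import csv
import io

# Hypothetical spending rows: the column names and the OCID values are
# invented for illustration and do not come from any real dataset.
SPENDING_CSV = """fiscal_year,admin_unit_code,paid,ocid
2019,09,1500.00,ocds-abc123-0001
2019,12,320.50,ocds-abc123-0002
2019,12,80.00,
"""

# Hypothetical contract records, keyed by OCID, as they might be
# extracted from a separate contracting publication.
CONTRACTS = {
    "ocds-abc123-0001": {"supplier": "Supplier A", "title": "Road maintenance"},
    "ocds-abc123-0002": {"supplier": "Supplier B", "title": "Medical supplies"},
}


def link_contracts(spending_csv, contracts):
    """Attach the supplier to each spending row that carries an OCID."""
    linked = []
    for row in csv.DictReader(io.StringIO(spending_csv)):
        contract = contracts.get(row["ocid"])  # None when the row has no contract
        supplier = contract["supplier"] if contract else None
        linked.append({**row, "supplier": supplier})
    return linked


rows = link_contracts(SPENDING_CSV, CONTRACTS)
for row in rows:
    print(row["admin_unit_code"], row["paid"], row["supplier"])
```

The spending file only stores the identifier; everything else about the contract stays in the contracting dataset, which is the point of linking rather than duplicating.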

Second: Structure of the Dataset


The classifications should be disaggregated to the lowest level available so users can make more specific use of the data.

Codes and descriptions

All of the classifications mentioned in the section above should always include one column for the code and one for the description for each level of disaggregation, as displayed in the example below. These fields will allow users to know which parts of the budget the dataset refers to. A clear data structure will allow users to understand the different levels of the budget, how the programs, projects, etc. are built and how money is allocated to them in the different stages of the budget.

Having clear IDs and descriptions will also allow mapping to the specification in a way that will later make it easy to visualise and navigate the dataset once it is uploaded as an Open Fiscal Data Package.

The following image exemplifies the structure of ID + Description of the different levels of economic classification.
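
As a text-only illustration of that layout, the Python sketch below builds a tiny CSV with one code column and one description column per level of an economic classification, and checks that no level is missing either half of the pair. The column naming convention (`_code` / `_desc`) and all codes and labels are assumptions made for this example.

```python
import csv
import io

# Hypothetical two-level economic classification; codes and labels are invented.
SAMPLE = """economic_l1_code,economic_l1_desc,economic_l2_code,economic_l2_desc,approved
1,Current expenditure,11,Salaries,1000
1,Current expenditure,12,Transfers,400
2,Capital expenditure,21,Public works,750
"""


def missing_pairs(header):
    """Return classification levels that lack their code or their description column."""
    codes = {name[: -len("_code")] for name in header if name.endswith("_code")}
    descs = {name[: -len("_desc")] for name in header if name.endswith("_desc")}
    # Symmetric difference: levels that have only one of the two columns.
    return sorted(codes ^ descs)


reader = csv.DictReader(io.StringIO(SAMPLE))
print(missing_pairs(reader.fieldnames))
```

An empty list means every level has both its ID and its description; any name in the list points to a level where one of the two columns is absent.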

Horizontal structure

After analyzing use cases of the dataset and observing users interact with different structures, we have decided that each stage of the transaction should have its own column of amounts (the alternative is a single column naming the stage plus a single column with the amounts).
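
For publishers whose systems export the alternative layout (one stage column plus one amount column), the following Python sketch shows one way to reshape it into the horizontal structure with one column per stage. The stage names, program codes and amounts are hypothetical.

```python
from collections import defaultdict

# Long layout: one row per (budget line, stage). All values are invented.
LONG_ROWS = [
    {"program_code": "P01", "stage": "approved", "amount": 1000.0},
    {"program_code": "P01", "stage": "committed", "amount": 900.0},
    {"program_code": "P01", "stage": "paid", "amount": 850.0},
    {"program_code": "P02", "stage": "approved", "amount": 500.0},
    {"program_code": "P02", "stage": "paid", "amount": 120.0},
]

# Assumed set of stages available in this hypothetical system.
STAGES = ["approved", "committed", "paid"]


def to_wide(long_rows, stages):
    """Reshape long rows into one row per budget line with one column per stage."""
    # Every budget line starts with all stage columns present but empty (None).
    wide = defaultdict(lambda: dict.fromkeys(stages))
    for row in long_rows:
        wide[row["program_code"]][row["stage"]] = row["amount"]
    return [{"program_code": code, **amounts} for code, amounts in wide.items()]


wide_rows = to_wide(LONG_ROWS, STAGES)
for row in wide_rows:
    print(row)
```

Stages that a budget line has not reached stay empty rather than being filled with zero, so users can tell "no data" apart from "nothing spent".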

Third: Extension of the file

It’s common for budget or spending reports to be published as PDF files generated directly from the data. These might be easy for a human to read, but they are not easy for a computer to process. This is why data uploaded to OpenSpending should be produced or saved as a comma-separated values (CSV) file. A CSV is the simplest machine-readable file and requires no special software, unlike XLS, which requires Excel or similar programs. It’s very light and allows you to start working with it using software that your computer may already have. A CSV can be managed with scripts, but it’s also very friendly for beginners to navigate, filter and modify without specific knowledge of databases.
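
As a small illustration of what “managed with scripts” can mean, the Python sketch below totals one amount column by a classification using only the standard library. The file contents and column names are made up for the example.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical CSV content; in practice this would be read from a file.
CSV_TEXT = """fiscal_year,functional_code,functional_desc,approved,paid
2019,01,Health,1000.00,950.25
2019,02,Education,2000.00,1800.00
2019,01,Health,500.00,499.75
"""


def total_by(csv_text, key, amount_column):
    """Sum one amount column, grouped by the values of a classification column."""
    totals = defaultdict(Decimal)  # Decimal avoids floating-point rounding on money
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[key]] += Decimal(row[amount_column])
    return dict(totals)


paid_by_function = total_by(CSV_TEXT, "functional_desc", "paid")
for function, total in paid_by_function.items():
    print(function, total)
```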

You can see an example of the data from Mexico’s Federal Government below.

We have also prepared a data template where you can see the classifications and other data that help ensure quality data is published.

Fourth: Uploading the dataset using OFDP

There are two ways of getting your budget data into OpenSpending. Both are available to everyone and can be tried today. We will discuss the two options and then compare under which circumstances you can best use which approach.

1. Upload directly with OS Packager

If you have a budget file already available as a tabular file in CSV form, you have everything you need to start using the Packager. You just need to create an account and you will be able to start uploading the data. If you have already published data using CKAN or any other open data platform, you can link directly to the CSV and start working with that data.

The packager tool divides the publication task into 4 steps:

  1. Data upload
  2. Data mapping/description
  3. Metadata input
  4. Data use

The tool guides you through each of these steps and, in the end, you will have a data package: a CSV file with the fields you originally had, as well as a JSON file with the description of these fields, mapped to the specification. This can be used directly in the OpenSpending Viewer.
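
To give a rough idea of what such a JSON description can look like, the Python sketch below assembles a simplified descriptor for a CSV file. It is only an illustration: the key names in the `model` block, the package name and the file name are assumptions, and the exact structure should be checked against the published OFDP specification or taken from the Packager’s own output.

```python
import json


def build_descriptor(name, csv_path, classification_columns, stage_columns):
    """Build a simplified, illustrative descriptor for a fiscal CSV file."""
    fields = [{"name": col, "type": "string"} for col in classification_columns]
    fields += [{"name": col, "type": "number"} for col in stage_columns]
    return {
        "name": name,
        "resources": [
            {"name": "budget", "path": csv_path, "schema": {"fields": fields}}
        ],
        # Assumed mapping block: classifications become dimensions and
        # transaction stages become measures. Key names are illustrative only.
        "model": {
            "dimensions": {col: {"source": col} for col in classification_columns},
            "measures": {col: {"source": col} for col in stage_columns},
        },
    }


descriptor = build_descriptor(
    name="example-budget-2019",  # hypothetical package name
    csv_path="budget_2019.csv",  # hypothetical file name
    classification_columns=["admin_unit_code", "functional_code"],
    stage_columns=["approved", "paid"],
)
print(json.dumps(descriptor, indent=2))
```

The design idea is that the CSV itself stays untouched, while the descriptor tells tools which columns are classifications and which are amounts.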

2. Set up an OpenSpending pipeline

Governments moving towards more timely and disaggregated publication, which is in turn more useful for users, have different needs from one-time publishers. For example, publishing a time series of spending requires merging the spending data for all years into one dataset, which also makes the complete dataset larger. For governments that have progressed to this more advanced publication, we can use a pipeline.

A pipeline is basically a set of instructions that we provide to map the data to the specification while keeping some of the nuances of our data. This process involves writing pieces of code to perform data processing and loading to selected endpoints. This option gives a more flexible publication but requires defining the best approach together with the Helpdesk.
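
The general shape of such a pipeline can be sketched in plain Python as a chain of small processing steps. This is not the OpenSpending pipeline tooling itself, only a generic illustration; the yearly extracts, column names and step names are hypothetical, and the final loading step to an endpoint is left out.

```python
import csv
import io

# Hypothetical yearly extracts; in practice these would be files exported from the FMIS.
YEARLY_CSVS = {
    2018: "program_code,approved,paid\nP01,900,870\nP02,450,100\n",
    2019: "program_code,approved,paid\nP01,1000,850\n",
}


def read_years(yearly_csvs):
    """Step 1: read every yearly file and tag each row with its fiscal year."""
    for year, text in sorted(yearly_csvs.items()):
        for row in csv.DictReader(io.StringIO(text)):
            yield {"fiscal_year": year, **row}


def cast_amounts(rows, amount_columns):
    """Step 2: convert the amount columns from text to numbers."""
    for row in rows:
        yield {**row, **{col: float(row[col]) for col in amount_columns}}


def run_pipeline(yearly_csvs):
    """Chain the steps and return one merged, multi-year dataset."""
    rows = read_years(yearly_csvs)
    rows = cast_amounts(rows, ["approved", "paid"])
    return list(rows)


merged = run_pipeline(YEARLY_CSVS)
for row in merged:
    print(row)
```

Because each step is a generator, rows flow through one at a time, which keeps memory use low when the merged time series grows large.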

Which option is best for you?

There is no unique answer to this, but there are a few questions that might help guide the initial decision to begin publishing using OpenSpending.

For example,

  • Has your government implemented a Financial Management Information System (FMIS) in which budget and spending are registered?
  • Are budget and spending data stored in different systems?
  • Is there any human intervention to consolidate the budget and/or spending data?
  • How often does the government present spending reports to the legislature? Does the system allow this periodic data extraction?
  • Does the country have historical data on budget and spending? How far in time is it available either in systems or in stored files?
  • Do you follow any standardised publication patterns at the moment?

If you’re interested in pushing for better publication, please contact our Helpdesk; we’re happy to help.




[1] Pattanayak, Sailendra. Expenditure Control: Key Features, Stages, and Actors. Fiscal Affairs Department, International Monetary Fund, 2016.

Towards a schema for spending Open Data, Helpdesk included

        by Lorena Rivero del Paso (GIFT) and Oscar Montiel (Open Knowledge International)

Having data for budgets and spending can allow us to track public money flows in our communities. It can give us insights into how governments plan and focus on programmes, public works, and services. So the Global Initiative for Fiscal Transparency (GIFT) and Open Knowledge International (OKI) have been working on new tools to make this data more useful and easier to understand.

Two of these are the Open Fiscal Data Package (OFDP) specification and the OpenSpending platform.

The OFDP is a data specification that allows publishers to create a literal package of data. This package includes fiscal data mapped onto either standardised or bespoke functional, economic and administrative classifications. Additionally, the different stages of the budget can be mapped, as can other fields that are relevant to the publisher. This seeks to reduce the barriers to accessing and interpreting fiscal open data.

One of the main benefits of the OFDP is that data publishers can adopt it no matter how they generate their databases. The flexibility of this specification allows publishers to improve the quality incrementally. There is no need to develop new software. Having this structured data allows us to build tools and services over it for visualization, analysis or comparison.

The second tool is actually a set of tools called OpenSpending.

This is an open-source, community-driven project. It reflects the valuable contributions of an active, passionate and committed community.

OpenSpending enables analysis, dissemination, and debate for more efficient budgets and public spending. It allows anyone to create, use, and visualize fiscal data using the Open Fiscal Data Package in a centralized place with little effort.

As part of this collaboration, OKI and GIFT have been working with different government partners to publish using OFDP. But we want to see the adoption of the Open Fiscal Data Package grow even more. This is why we have set up the Fiscal Data Helpdesk to help you in the publication process!

How to engage with the Fiscal Data Helpdesk

Maybe you are already publishing fiscal data through an open data portal? Or maybe you have a platform and want to make it more useful for a larger number of users? Perhaps you have heard about standardization but it sounds complex and you think it might not be for your office? The Helpdesk is around to answer all your questions and support you through the process of getting data up and running in OpenSpending.

There are a few good examples of what we would like you to do. We’ve worked with the Mexican federal government to publish their data from 2008 to 2019 using the OFDP and OpenSpending to make it easier to access. Their data can be navigated on OpenSpending.

We’ve also worked to get datasets from many countries in the World Bank BOOST initiative on OpenSpending. Currently, there are data from countries like Burkina Faso, Guatemala, Paraguay, and Uruguay.

In the coming weeks, we will publish some resources and a series of blog posts to give you more information about publishing your data in OFDP and using OpenSpending.

Interested? You can visit the OFDP section or send us an email. We will get back to you to help get your budgets out in the open!