Code Modules: Maximising the Impact of IDI Data in Aotearoa

Launched in 2022, the code modules initiative aims to provide researchers with consistent, reliable and easy-to-use datasets sourced from Aotearoa’s Integrated Data Infrastructure (IDI). The IDI is a large research database that gives users access to rich, whole-of-system insights about people and households within Aotearoa.

As expert users of the IDI, we know that making it easier for researchers to create high-quality, consistent and up-to-date datasets with IDI data will lead to more robust policy and research – and ultimately, better outcomes for all New Zealanders.

Nicholson Consulting is proud to have been a part of this initiative since its inception. Over the last couple of years, we have worked closely with Stats NZ, MBIE and the wider IDI research community to develop several of the modules that are now available for use. 

To learn more about code modules and how you can use them in your mahi, keep reading! 

What are code modules? 

Code modules are high-quality data tools that query IDI data and transform it into meaningful, reliable and ready-to-use datasets for the foundational measures researchers need – such as benefit receipt, migration status, income, or school attendance.

Each module provides all the code needed to create these datasets, along with the information needed to use them safely, covering things like business rules, data inclusions and exclusions, transformations, and filters.

Code modules are updated every time the source data is refreshed in the IDI, so they always provide the most up-to-date outputs possible.

Why are code modules useful?

While the IDI is a powerful research tool, researchers run into some common pain points when working with it.

For instance, researchers often extract the same commonly used datasets from the IDI, such as school attendance or employment rates. This not only duplicates time and effort, but also leads to variation between datasets, because researchers apply different business rules when creating them. As a result, the final numbers can differ depending on who produced them, which makes it difficult for researchers to compare findings with one another or track trends over time.

Additionally, researchers require data and programming expertise to navigate the IDI, creating a barrier to entry for those without these skills. Even for those with the technical capability, extracting data from the IDI can require a lot of time and effort.

Code modules help address these challenges, and provide some key benefits such as:

Improved consistency and quality of data

Code modules provide researchers with tested, consistent, easily reproducible datasets that come with detailed business rules already applied. This makes it easier for researchers to compare outputs with other researchers working in the same domain, and monitor changes over time – ultimately leading to better quality research.

Reducing technical barriers to entry

Code modules make IDI data more accessible to researchers with limited data or programming expertise. While a basic command of SQL is still required to run a code module, the need to be an expert in that area to reliably extract insights is greatly reduced.

Improved efficiency

Code modules help reduce unnecessary duplication of time and effort, by providing all the necessary business rules and code to create commonly used datasets.

How are code modules developed? 

Code modules are selected for development based on what data will be most useful to IDI researchers across multiple research areas.   

Each code module has a steward from a relevant government agency, who is an expert in the IDI source data. These stewards help define business rules and terminology. In addition, there is a community of IDI data experts who advise how the data can be used and what documentation should be provided for each module.   

The whole initiative is currently overseen by a group of agencies including Stats NZ, MBIE, MSD, SIA, OT, TWO, MoH, and MOE. Nicholson Consulting works alongside other organisations, government agencies, and subject-matter and IDI experts within the code module community to bring this initiative to life.

As part of the initiative, the Nicholson Consulting team contributes to the development of code modules in several ways, including:

  • Data domain expertise

  • Writing code 

  • Testing code 

  • Preparing documentation  

  • Deploying code modules 

  • Engagement with the IDI community

  • Training other IDI researchers to produce modules

Current code modules available for use 

There are currently 35 modules available to be used in IDI projects.  

New code modules can be deployed at any time. However, most modules are released as part of the BAU IDI refresh process once every four months. Nicholson Consulting has been heavily involved with many of the already deployed code modules as well as the following two modules released with the October 2024 IDI refresh:

  • Social housing which can be used for research into met and unmet needs for social housing. This module can also be used to estimate the average time spent on social housing waiting lists.

  • ASH and PAH which defines publicly funded Ambulatory Sensitive Hospitalisation (ASH) and Potentially Avoidable Hospitalisation (PAH) events.

These join the other code modules available for use in IDI projects, including:

  • Driver licence status and endorsements which are two separate modules, one listing whether a licence is current or with restriction and the second listing licence endorsements. 

  • Employment spells which can be used to produce estimates of people who are employed at a given time. 

  • Total Income which is a collection of 10 modules that construct tables of a variety of personal income types. 

  • MSD Employment assistance which creates spells of participation in some employment and training programmes. 

  • MSD Income support payments which creates spells with main and supplementary benefit details as well as tax credits. 

  • Migration spells which creates spells for non-New Zealand citizens entering New Zealand and visas held during their stay. 

  • Educational attainment which covers qualifications attained in secondary and tertiary education and the highest qualification a de-identified individual has received. 

  • School attendance which gives details of when children are present or absent from school or early childhood education. 

  • Household Labour Force Survey (HLFS) summary tools which is a suite of resources to help researchers use the HLFS. 
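
Several of these modules produce "spells": one row per person per continuous period, with a start and end date. To give a feel for how spell-based outputs are typically used, here is a minimal Python sketch. It does not use real IDI data or any module's actual table or column names. The rows, the (person_id, spell_start, spell_end) layout and both function names are assumptions made purely for illustration.

```python
from datetime import date

# Hypothetical spell records: (person_id, spell_start, spell_end).
# Made-up rows for illustration only; not IDI data or a real module schema.
SPELLS = [
    (1, date(2023, 1, 1), date(2023, 6, 30)),
    (1, date(2023, 9, 1), date(2024, 3, 31)),
    (2, date(2023, 5, 15), date(2023, 12, 31)),
    (3, date(2024, 1, 10), date(2024, 2, 20)),
]


def people_in_spell_on(spells, reference_date):
    """Return the ids of people with a spell covering reference_date (inclusive)."""
    return {pid for pid, start, end in spells if start <= reference_date <= end}


def mean_spell_length_days(spells):
    """Average spell length in days, counting both the start and end dates."""
    lengths = [(end - start).days + 1 for _, start, end in spells]
    return sum(lengths) / len(lengths)


# Who has a spell in progress on 1 June 2023?
print(sorted(people_in_spell_on(SPELLS, date(2023, 6, 1))))  # prints [1, 2]

# Average spell length, the kind of summary a waiting-list estimate might use.
print(mean_spell_length_days(SPELLS))
```

In practice the business rules baked into each module (for example, how overlapping or open-ended spells are handled) matter a great deal, so treat this only as a picture of the general idea and refer to the module documentation for the real definitions.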

How can I use code modules? 

If you would like to use code modules in your mahi, you can learn more through the Stats NZ website here: https://www.stats.govt.nz/integrated-data/code-modules-initiative/  

The code modules initiative is still growing, with more topics being added to the existing suite of code modules on a regular basis. If you are a researcher and believe that code modules may be useful in your area of research, reach out to the code modules team via the link above.

We are excited to be a part of this initiative and to have the opportunity to share our knowledge with the wider IDI community. We believe that through making IDI data more accessible for researchers, this initiative will contribute to more robust policy and research, and ultimately lead to better outcomes for all New Zealanders. 

If you would like to know more about our work with code modules, or how we can help you work with the IDI, reach out for a kaputī at hello@nicholsonconsulting.co.nz
