The Longitudinal Research Platform
-
Agincourt Health and Socio-Demographic Surveillance System (HDSS)
Established in 1992, the Agincourt Health and Socio-Demographic Surveillance System (HDSS) covers a population of about 120,000 people residing in 21,000 households in 31 villages in the Bushbuckridge sub-district, Mpumalanga province. The HDSS database contains data from the comprehensive coverage of demographic events within this geographically defined population over some 30 years, coupled with health and socio-economic information.
We track the dynamic lives of communities through a rich set of measured variables, including pregnancy outcomes, deaths, migration patterns, household relationships, nationality, marital status, and education. A verbal autopsy is conducted for every recorded death within a year, drawing on an interview with the closest caregiver to determine the probable cause.
Our commitment to data integrity is unwavering: rigorous quality checks, duplicate surveys, and built-in consistency validations ensure reliability. All data feed into a longitudinal relational database that captures the full life course of every individual ever recorded in our study area. This meticulous approach underpins the research and real-world impact built on the platform.
The multi-year longitudinal database contributes substantial scientific output, including trend data, in its own right. The HDSS is a robust data resource, serving as a powerful and versatile sampling frame for observational and intervention studies, as well as policy evaluations.
-
Clinic and Hospital Link
The “Agincourt HDSS-Clinic-Hospital link” is an ongoing longitudinal data platform that captures patient information from primary healthcare facilities in the Agincourt study area and two nearby district hospitals. This information is then linked in real time to the Agincourt population database. Linking these data from eight primary care facilities (since 2014) and two hospitals (since 2020) to the health and demographic surveillance system (HDSS) enables monitoring of the burden and impact of various illnesses, of service coverage, and of which healthcare services are being used. The data then allow researchers and local/district health managers to obtain much-improved statistics on the utilisation of services, integrated chronic care, and the burden/impact of key conditions, including co-morbidities in a transitioning society.
How it works
We use both deterministic and probabilistic methods to link patient records securely. When a new patient visits a clinic, a data clerk obtains written consent to collect and link their clinical data to the Agincourt population system.
- The clerk searches the HDSS database using a national ID number.
- If no match is found, they try a combination of mobile number, first name, and birthdate.
- If needed, an advanced probabilistic algorithm (Fellegi-Sunter model) helps identify potential matches, which are verified with the patient.
Once linked, clinical data are extracted, and follow-up visits are logged. Strict privacy safeguards ensure secure data storage and encryption. These linked records support critical research and trial evaluations, enhancing healthcare insights.
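The three search steps above can be sketched in code. This is a minimal illustration only, not the platform's actual implementation: the `Person` fields, the agreement/disagreement probabilities in `M_U`, the `threshold` value, and the sample records are all hypothetical. The scoring function follows the standard Fellegi-Sunter idea of summing log-likelihood-ratio weights over compared fields.

```python
from dataclasses import dataclass
from math import log2
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Person:
    national_id: Optional[str]
    mobile: Optional[str]
    first_name: str
    birthdate: str  # ISO date string, e.g. "1980-01-01"


# Hypothetical (m, u) probabilities per field:
# m = P(field agrees | same person), u = P(field agrees | different people).
M_U = {
    "first_name": (0.95, 0.01),
    "birthdate": (0.97, 0.003),
    "mobile": (0.90, 0.0001),
}


def fs_weight(patient: Person, candidate: Person) -> float:
    """Sum Fellegi-Sunter agreement/disagreement weights over the fields."""
    total = 0.0
    for field, (m, u) in M_U.items():
        a, b = getattr(patient, field), getattr(candidate, field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        total += log2(m / u) if a == b else log2((1 - m) / (1 - u))
    return total


def link(patient: Person, registry: List[Person],
         threshold: float = 10.0) -> Tuple[str, List[Person]]:
    """Return the step that produced the result and the candidate matches."""
    # Step 1: deterministic match on national ID.
    if patient.national_id:
        hits = [p for p in registry if p.national_id == patient.national_id]
        if len(hits) == 1:
            return "national_id", hits
    # Step 2: deterministic match on mobile number, first name and birthdate.
    if patient.mobile:
        key = (patient.mobile, patient.first_name, patient.birthdate)
        hits = [p for p in registry
                if (p.mobile, p.first_name, p.birthdate) == key]
        if len(hits) == 1:
            return "mobile_name_birthdate", hits
    # Step 3: probabilistic scoring; candidates still need to be
    # verified with the patient before a link is accepted.
    scored = sorted(((fs_weight(patient, p), p) for p in registry),
                    key=lambda pair: pair[0], reverse=True)
    return "probabilistic", [p for w, p in scored if w >= threshold]


# Made-up records for illustration only.
registry = [
    Person("8001015009087", "0820000001", "Thandi", "1980-01-01"),
    Person(None, "0820000002", "Sipho", "1975-06-15"),
]
print(link(Person(None, "0829999999", "Sipho", "1975-06-15"), registry))
```

In the last call the patient has no ID and a changed mobile number, so both deterministic steps fail and the record is returned only as a probabilistic candidate for the clerk to confirm.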
-
Excess Mortality
HDSS-based mortality surveillance in Africa and South Asia
Following the declaration of COVID-19 as a pandemic by the World Health Organization, there have been high levels of reported deaths, at least in countries with functioning civil registration and vital statistics (CRVS). These largely under-represent the true mortality figures owing to COVID-19.
What is the impact of COVID-19 on actual mortality?
We want to find out the scale of excess deaths and the population sub-groups most affected, particularly in low- and middle-income settings. Constructing a true representation of COVID-19 deaths can be useful for social policies and future pandemic preparedness planning. The goal of this initiative is to characterise all-cause mortality rates and trends, by age and sex, across a range of rural and urban sub-Saharan African and South Asian settings under continuous health and demographic surveillance.
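As a rough illustration of what "excess deaths" by age and sex means, the sketch below compares observed deaths in a period with the number expected if a baseline mortality rate had continued. This is an assumed, deliberately simple baseline-rate approach, not the initiative's published method; real analyses typically also model trends and seasonality. All counts and person-years are made up.

```python
from typing import Dict, Tuple

Stratum = Tuple[str, str]          # (age group, sex)
Counts = Tuple[int, float]         # (deaths, person-years of observation)


def rate_per_1000(deaths: int, person_years: float) -> float:
    """All-cause mortality rate per 1,000 person-years."""
    return deaths * 1000 / person_years


def excess_by_stratum(baseline: Dict[Stratum, Counts],
                      period: Dict[Stratum, Counts]) -> Dict[Stratum, float]:
    """Observed minus expected deaths for each age-sex stratum."""
    excess = {}
    for stratum, (observed, person_years) in period.items():
        base_deaths, base_py = baseline[stratum]
        # Expected deaths if the baseline rate applied to this period's exposure.
        expected = base_deaths * person_years / base_py
        excess[stratum] = observed - expected
    return excess


# Illustrative numbers only.
baseline = {("60+", "F"): (120, 4000.0), ("60+", "M"): (150, 3000.0)}
pandemic = {("60+", "F"): (45, 1200.0), ("60+", "M"): (70, 1000.0)}

print(rate_per_1000(*baseline[("60+", "F")]))
print(excess_by_stratum(baseline, pandemic))
```

With these made-up inputs, women aged 60+ have 36 expected deaths against 45 observed, giving 9 excess deaths in that stratum.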
This is a multinational initiative bringing together 17 sites/centres from Africa and South Asia. Chodziwadziwa Kabudula and Stephen Tollman of the SAMRC/Wits-Agincourt Research Unit, with Kobus Herbst of the South African Population Research Infrastructure Network (SAPRIN) and Beth Tippett-Barr of Nyanja Health Research Institute, have co-hosted three face-to-face workshops to date: in March and November 2022, and July 2023. The goal of these workshops has been to strengthen capacity in data management, analysis, scientific writing and dissemination, and so better understand the impact of COVID-19 on excess mortality in African and South Asian settings.
- PI(s): Stephen Tollman, Kathleen Kahn
- Co-investigators: Chodziwadziwa Kabudula, Kobus Herbst
- Programme Manager: Tshegofatso Seabi
- Funder: Bill & Melinda Gates Foundation, USA
- Collaborating Institutions: SAPRIN, Africa Health Research Institute, DIMAMO Population Health Research Centre, and Wits-VIDA Soweto CHAMPS (SA); Manhiça Health Research Centre (Mozambique); African Population and Health Research Center, Kaloleni/Rabai, Siaya-Karemo, Manyatta (Kenya); Iganga-Mayuge (Uganda); Magu (Tanzania); Nyanja Health Research Institute (Malawi); Kersa (Ethiopia); Navrongo (Ghana); Nanoro (Burkina Faso); Matlab, Dhaka, Chakaria (Bangladesh); Vadu (India)
-
MADIVA
Multimorbidity in Africa: Digital Innovation, Visualisation and Application (MADIVA)
The MADIVA Research Hub is dedicated to creating data science methods and solutions to address multimorbidity challenges in Africa, where co-occurring diseases contribute significantly to the health burden. Our primary research sites, located in rural Bushbuckridge, South Africa, and urban Nairobi, Kenya, possess extensive longitudinal data collected through health and demographic surveillance systems. We also have emerging clinical health records and genomic data.
We develop and apply data science techniques to link different datasets, create dashboards for stakeholders, and use advanced machine learning to assess disease risk in different groups. This includes leveraging polygenic risk scores (PRSs).
(A PRS is a number that estimates a person's genetic susceptibility to developing a particular disease based on the combined effect of many small genetic variations across their genome.)
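To make the definition concrete, a PRS is commonly computed as a weighted sum: for each variant, the number of risk (effect) alleles a person carries (0, 1 or 2) is multiplied by that variant's estimated effect size, and the products are added up. The sketch below assumes this common additive form; the variant names, weights and genotypes are hypothetical, and it is not a description of MADIVA's own pipeline.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, int],
                         weights: Dict[str, float]) -> float:
    """Additive PRS: sum of effect size x effect-allele count per variant.

    Variants missing from `dosages` are treated as 0 copies here, which is
    a simplifying assumption; real pipelines handle missing genotypes
    explicitly (for example by imputation).
    """
    return sum(weight * dosages.get(variant, 0)
               for variant, weight in weights.items())


# Hypothetical effect sizes (e.g. log odds ratios from an association study).
weights = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.30}
# Hypothetical genotype: copies of the effect allele carried at each variant.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

print(polygenic_risk_score(person, weights))
```

For this made-up person the score is about 0.19 (0.24 from variant_a, minus 0.05 from variant_b). A raw score like this only becomes meaningful when compared with the distribution of scores in a relevant reference population.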
DMAC (Data Management and Analysis Core):
Led by Chodziwadziwa Kabudula, DMAC manages and analyses complex multimorbidity data in the MADIVA research hub. It provides tools, expertise, and infrastructure to integrate and visualise diverse datasets, helping improve healthcare in sub-Saharan Africa.
TCDPC (Training, Capacity Development and Pilot Core):
The TCDPC supports pilot projects and trains new researchers in the MADIVA research hub. It focuses on early career development, capacity building, and mentorship to strengthen healthcare systems in sub-Saharan Africa.
- Project 1: Led by Xavier Gómez-Olivé, this project uses data from Nairobi and rural South Africa to improve health system responses to multimorbidity through data integration and visualisation.
- Project 2: This project improves disease prediction models using genetic and long-term health data from African communities. It combines traditional risk algorithms with machine learning to identify high-risk groups, guide interventions, and support healthcare decisions.
- PI(s): Scott Hazelhurst (PI of record), Stephen Tollman, Michele Ramsay, Catherine Kyobutungi
- Project Manager: Kerry Glover
- Funder: NIH, USA
- Collaborating Institutions: IBM Research Africa, DS-I Africa – Data Science for Health Discovery and Innovation in Africa and SAPRIN (SA); African Population and Health Research Center (Kenya); Vanderbilt University Medical Center (USA).
- Website: https://www.madiva.africa/
-
Minimally Invasive Tissue Sampling (MITS)
Integrating minimally invasive tissue sampling into established community-based mortality surveillance in rural South Africa: contributing to strengthening verbal autopsy
Understanding who dies from what is key to improving healthcare and shaping effective policies. In rural South Africa, this challenge is even greater—autopsy services are scarce, over half of deaths happen outside hospitals, and disease patterns are rapidly shifting.
Verbal autopsy (VA) methods have helped estimate causes of death, but certain conditions with vague symptoms remain difficult to diagnose. Enhancing VA with minimally invasive tissue sampling could significantly improve accuracy. By refining these methods, we can gain deeper insights into mortality patterns and build stronger, more responsive health systems.
As a new member of the Minimally Invasive Tissue Sampling (MITS) Alliance, we're pioneering new ways to improve mortality surveillance by integrating minimally invasive tissue sampling (MITS) with verbal autopsy (VA). Our work includes:
- Piloting MITS as a routine method to enhance population-based death investigations in conjunction with the VA method.
- Comparing MITS vs VA across key groups—newborns, children, maternal deaths, and adults—to assess accuracy.
- Expanding MITS research to explore adult deaths, test cutting-edge technology, and compare rural vs urban mortality patterns.
By leading this effort, we aim to strengthen global MITS surveillance and drive better public health responses.
- Co-PIs: Kathleen Kahn, Ryan Wagner
- Project Manager: Lucky Mondlane
- Funder: The Bill & Melinda Gates Foundation through the MITS Alliance, USA
- Collaborating Institutions: The MITS Alliance, and The Ohio State University (USA); Wits Department of Pathology (SA)