Assessing Identity Disclosure Risk in the Absence of Identified Datasets in the Public Domain

Peter N. Muturi; Andrew M. Kahonge; Christopher K. Chepken

dc.contributor.author	Peter N. Muturi
dc.contributor.author	Andrew M. Kahonge
dc.contributor.author	Christopher K. Chepken
dc.date.accessioned	2024-11-17T17:51:53Z
dc.date.available	2024-11-17T17:51:53Z
dc.date.issued	2024-07-17
dc.description.abstract	Data release is essential in supporting data analytics and secondary data analyses. However, data curators need to ensure that released datasets preserve data subjects’ privacy and retain analytical utility. Data privacy is achieved through the anonymisation of datasets before release. The risk of disclosure posed to the dataset should inform the level of anonymisation to be undertaken. While anonymisation achieves data privacy, it reduces the analytical utility of the dataset by introducing alterations to the original data values. Therefore, data curators require an appropriate estimate of the dataset’s identity disclosure risk to inform the level of anonymisation that balances privacy and utility. The disclosure risk varies from one geographical region to another due to varying enabling factors. This paper assesses the disclosure risk and the enabling factors in an environment lacking identified datasets in the public domain. The study used a quasi-experimental design to carry out an empirical identity disclosure test, in which respondents were given an anonymised dataset and were required to disclose the identity of any of the records. The findings were that background knowledge of the released datasets was the primary enabler in the absence of identified datasets. Respondents could disclose records only in the dataset they were familiar with. However, the disclosure risk was within an acceptable threshold. Therefore, the study concluded that in an environment lacking identified datasets in the public domain, reasonable anonymisation could achieve a balance of privacy and utility in datasets. The findings justify private data release that can support data analytics and secondary data analyses in environments lacking identified datasets in the public domain.
dc.identifier.uri	https://erepository.ouk.ac.ke/handle/123456789/1484
dc.publisher	East African Journal of Information Technology
dc.title	Assessing Identity Disclosure Risk in the Absence of Identified Datasets in the Public Domain

Files

Original bundle

Name: document..pdf
Size: 376.93 KB
Format: Adobe Portable Document Format

Collections

Journal Articles