Assessing Identity Disclosure Risk in the Absence of Identified Datasets in the Public Domain

dc.contributor.author: Peter N. Muturi
dc.contributor.author: Andrew M. Kahonge
dc.contributor.author: Christopher K. Chepken
dc.date.accessioned: 2024-11-17T17:51:53Z
dc.date.available: 2024-11-17T17:51:53Z
dc.date.issued: 2024-07-17
dc.description.abstract: Data release is essential in supporting data analytics and secondary data analyses. However, data curators need to ensure the released datasets preserve data subjects’ privacy and retain analytical utility. Data privacy is achieved through the anonymisation of datasets before release. The risk of disclosure posed to the dataset should inform the level of anonymisation to be undertaken. As anonymisation achieves data privacy, it reduces the analytical utility of the dataset by introducing alterations to the original data values. Therefore, data curators require an appropriate estimate of the dataset’s identity disclosure risk to inform the required anonymisation that balances privacy and utility. The disclosure risk varies from one geographical region to another due to varying enabling factors. This paper assesses the disclosure risk and the enabling factors in an environment lacking identified datasets in the public domain. This study used a quasi-experimental design in carrying out an empirical identity disclosure test, where respondents were given an anonymised dataset and were required to disclose the identity of any of the records. The findings were that background knowledge of the released datasets was the primary enabler in the absence of identified datasets. Respondents could only disclose records in the dataset they had familiarity with. However, the disclosure risk was within an acceptable threshold. Therefore, the study concluded that in an environment lacking identified datasets in the public domain, reasonable anonymisation could achieve a balance of privacy and utility in datasets. The findings justify private data release able to support data analytics and secondary data analyses in environments lacking identified datasets in the public domain.
dc.identifier.uri: https://erepository.ouk.ac.ke/handle/123456789/1484
dc.publisher: East African Journal of Information Technology
dc.title: Assessing Identity Disclosure Risk in the Absence of Identified Datasets in the Public Domain
Files
Original bundle
Name: document..pdf
Size: 376.93 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission