The NiPN data quality toolkit

Introduction

This document presents a set of practical analytical methods that can be applied to variables in datasets to assess their quality. An index of data quality that both describes and scores the quality of the data is also presented.

The focus of this toolkit is on the data required to assess anthropometric status: measurements of weight, height or length, and mid-upper arm circumference (MUAC), together with sex and age.

Although the focus is on anthropometric status, many of the methods presented could be applied to other variables. NiPN may commission additional toolkits to examine other variables or other types of variables.

Data quality is assessed by:

These activities and a proposed order in which they should be performed are shown in the figure below.

[Figure: NiPN data quality workflow]

The material is intended to provide a practical or “hands-on” introduction to assessing data quality and is presented as a series of computer-based exercises. Example datasets are provided.

Extensive use is made of the R language and environment for statistical computing. This is a free and powerful data analysis system. Methods have been described in sufficient detail to allow activities to be performed using other data analysis systems.

R provides a very extensive language for working with data. The material presented here has been written using only a small subset of the R language. Many of the data quality activities are supported by R functions that have been written specifically for this purpose. These functions simplify the assessment of the quality of data related to anthropometry and anthropometric indices. The basic R functions, the purpose-written functions, and the filenames of the example datasets are also shown in the figure above.

The purpose-written functions are described in detail here.
