
Data Collection and Processing

This section mainly presents material supporting the design and conduct of data collection (questionnaires, physical measures, etc.) and data processing.

Conceptualization

Please visit the following section:

Design and Conduct

Questionnaires

In Theory

Data classification index and general documentation on collection of epidemiological data.

Please also visit the following section(s):

Consensus measures for phenotype and exposure (PhenX)
This website aims to contribute to the integration of genetic and epidemiologic research. A recommended minimal set of high-priority measures for use in Genome-wide Association Studies (GWAS) and other large-scale genomic research efforts is presented in the PhenX toolkit under different domains. Standard procedure protocols are provided for each measure in the toolkit. PhenX is led by RTI International and funded by the National Human Genome Research Institute (NHGRI).

Recommendation for indicators, international collaboration, protocol and manual of operations for chronic disease risk factor surveys (PDF)
Part II of this document deals with issues related to defining every step of the survey organization, from the target population to the recruitment itself. Part III contains protocols and operational guidelines for individual measurements and biological sample collection. A European Health Risk Monitoring document; copyright is reserved by the Finnish National Public Health Institute (2002).

Comparison Chart of Guidelines
This document compares selected reference guidelines written by well-known organizations. These guidelines cover all biobanking steps, i.e. biological sample collection, labelling, processing, and storage applied to a wide range of sample types. A P3G-prepared document (2008).

Classification indexes

Anatomical Therapeutic Chemical Classification System (ATC/DDD Index 2006)
This website gives access to information about, and documents referring to, the classification of drugs. There is an important focus on adopting a standardized approach based on the Anatomical Therapeutic Chemical (ATC) classification and on the Defined Daily Dose (DDD), a technical unit of measurement for drug utilisation studies. A World Health Organization Collaborating Centre for Drug Statistics Methodology website, hosted at the Norwegian Institute of Public Health.

Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR)
This website contains the handbook for mental health professionals, which lists the different categories of mental disorders and the criteria for their diagnosis. Published by the American Psychiatric Association and presented on the BehaveNet® Inc. website.

International Classification of Diseases (ICD)
This website presents the latest version of the ICD-10, a detailed description of known diseases and injuries. Every disease (or group of related diseases) is described together with its diagnosis and its unique code. A World Health Organization classification.

International Standard Classification of Occupations (ISCO)
This website presents the classification structure for organizing jobs into a defined set of groups according to the tasks and duties undertaken in the job. Developed by the International Labour Organization.

ISCED97 (PDF)
This document presents the International Standard Classification of Education (ISCED), a framework for the compilation and presentation of national and international education statistics and indicators. It contains standard concepts, definitions and classifications. Designed by the UNESCO Institute for Statistics (1997).

National Occupational Classification (NOC)
This website provides a complete listing of all the categories under which Canadian jobs are classified and their description. A Statistics Canada-maintained website.

Standard Occupational Classification (SOC)
This website details a standardized system for classifying occupations in the United Kingdom. An Office for National Statistics-maintained website.

International Classification of Functioning, Disability and Health (ICF)
This website details a standardized system for classifying health and health-related domains that describe body functions and structures, activities and participation. The domains are classified from body, individual and societal perspectives. Since an individual's functioning and disability occur in a context, ICF also includes a list of environmental factors. ICF is a new member of the WHO Family of International Classifications.

In Practice

Real-life examples related to collection of epidemiological data.

Please visit the following sections:

Automated Self-administered 24-hour Dietary Recall (ASA24)
ASA24 is a software tool that enables automated and self-administered 24-hour dietary recalls. ASA24 consists of a Respondent Web site used by study participants to complete recalls and a Researcher Web site used by researchers to manage study logistics. This software has been developed by NCI in collaboration with USDA investigators and under contract with Archimage in Houston, TX, and Westat in Rockville, MD (2005).

Database of Genotypes and Phenotypes (dbGaP)
This website lists studies investigating the interaction of genotype and phenotype. Open-access data include a study description, phenotypic variables of interest, study documents (questionnaires, standard operating procedures, consent forms, etc.), and genotype-phenotype analyses. The database can be searched by study, disease, or keyword, and access to datasets can be requested. This website is maintained by the National Center for Biotechnology Information (NCBI) (2006).

UK Biobank: Report of the integrated pilot phase (PDF)
This document reports on, and provides recommendations based on, the results obtained from the pilot phase of UK Biobank. The pilot phase was conducted between February and June 2006, in order to assess the basic design of the study and the approaches to data and sample collection. Prepared by UK Biobank (2006).

UK Biobank: Protocol for a large-scale prospective epidemiological resource (PDF)
Part 1 of this document outlines the scientific rationale for constructing a very large cohort-based biobank in middle-aged adults in the UK. It also considers the basic design of such a study, reviewing and justifying decisions about baseline questionnaires, physical measures and biological samples. Prepared by UK Biobank (2007).

National Health and Nutrition Examination Survey (NHANES)
This website of the American survey displays all serum/plasma collection and biochemistry procedures, from cholesterol to heavy metals, electrolytes and metabolic enzymes. It also provides, among other things, questionnaire administration methods and individual physical and cognitive measurements. A US Department of Health and Human Services-maintained website (2006).

The National Children's Study
This section of the website of the National Children's Study (NCS) displays the "Reviews and Analytic Reports". It offers access to a wide variety of literature reviews and "white papers" on study design and conduct relevant to the construction of a large (approx. 100 000 live births) national birth cohort. Other sections of the website deal with issues specific to the NCS itself (e.g. study hypotheses). The website is co-supported by the U.S. Environmental Protection Agency, U.S. Department of Health and Human Services and USA.gov Agency.

Questionnaire Modules
This section of the website is a collection of research questionnaires intended to serve only as models and be adapted by researchers to suit their particular needs. This tool is provided by the National Cancer Institute's Division of Cancer Epidemiology and Genetics.

Comparison Chart of Key Health Organizations/Projects Involved in Biobanking
This document presents selected Key Health Organizations/Projects involved in Biobanking. For each organization/project, the table addresses the general mission and indicates if it is particularly involved in the four following domains: (1) Guidelines and scientific tools creation to support biobanks, (2) Networking and harmonization, (3) Scientific Discoveries and (4) Biobanking. A P3G-prepared document (2010).

Comparison Chart of Mental Disorders Instruments
This document compares selected reference mental disorders instruments. For each instrument, the table addresses the main fields of comparison, i.e. authors, objectives, domains covered, population targeted, time recall, number of items, administration mode, available translations, copyright, condition of use and references. A P3G-prepared document (2009).

Comparison Chart of Social Support Instruments
This document compares selected reference social support instruments. For each instrument, the table addresses the main fields of comparison, i.e. authors, objectives, domains covered, population targeted, time recall, number of items, administration mode, available translations, copyrights, condition of use and references. A P3G-prepared document (2009).

Comparison Chart of Nutrition Instruments
This document compares the different approaches to dietary assessment. For each approach, the table addresses the main fields of comparison, i.e. objectives, quantitative/semi-quantitative approaches, time recall, strengths/weaknesses, number of items, administration mode, references, and references to studies using this approach. A P3G-prepared document (2009).

Comparison Chart of Physical Activity Instruments
This document compares selected reference physical activity instruments developed to enable comparisons across culturally diverse populations. For each instrument, the table addresses the main fields of comparison, i.e. authors, objectives, domains covered, population targeted, time recall, number of items, administration mode, available translations, copyrights, condition of use and references. A P3G-prepared document (2008).

Comparison Chart of Physical Fitness Assessments
This document introduces the main tests that can be used to assess physical fitness and compares their use across a panel of reference studies and organizations. For each test, the objective and a brief description are given. A P3G-prepared document (2009).

Comparison Chart of Health-Related Quality of Life (QoL) Instruments
This document compares selected reference health-related quality of life instruments. For each instrument, the table addresses the main fields of comparison, i.e. authors, objectives, domains covered, population targeted, time recall, number of items, administration mode, available translations, copyrights, condition of use and references. A P3G-prepared document (2008).

HuGE Navigator
This website provides access to a continuously updated knowledge base in human genome epidemiology through three main sections. The HuGEpedia, including Phenopedia and Genopedia, is an encyclopedia of human genetic variation in health and disease. HuGEtools allow users to search and mine the literature in human genome epidemiology (HuGE Literature Finder, GWAS Integrator, HuGE Investigator Browser, Gene Prospector, Genotype Prevalence Catalog, HuGE Watch, Variant Name Mapper, HuGE Risk Translator). A series of HuGE-related informatics utilities and projects such as GAPscreener, HuGE Track, and Open Source are available in the HuGEmix section. Website developed and maintained by the Human Genome Epidemiology Network (HuGENet™) (2009).

IT Tools

Selected software for data entry.

Epi Info™
Epi Info™ is versatile software developed for work at the interface between epidemiology and public health. It can be used to quickly develop a simple questionnaire or form, to customize the data entry process or to undertake basic analyses. The emphasis is on speed and ease of use; it is not a platform for sophisticated modelling. The software and website are now maintained by the Centers for Disease Control and Prevention (CDC) in the United States.

PEDSYS
PEDSYS is a database system developed to provide a software environment in which to manage and analyse genetic and demographic data, particularly pedigree (family-based) data. The system supports integrated collection, management and analysis of constantly evolving data sets. The software and website are supported by the Southwest Foundation for Biomedical Research.

Progeny Anywhere
Progeny Anywhere is a pedigree drawing component (ActiveX Control or JavaBean). It allows users to produce pedigrees directly or from downloaded information. It is a free online tool from Progeny Software.

Physical and Cognitive Measures

In Theory

General documentation and guidelines regarding collection of physical and cognitive measures.

Please also visit the following section(s):

Consensus measures for phenotype and exposure (PhenX)
This website aims to contribute to the integration of genetic and epidemiologic research. A recommended minimal set of high-priority measures for use in Genome-wide Association Studies (GWAS) and other large-scale genomic research efforts is presented in the PhenX toolkit under different domains. Standard procedure protocols are provided for each measure in the toolkit. PhenX is led by RTI International and funded by the National Human Genome Research Institute (NHGRI).

Recommendation for indicators, international collaboration, protocol and manual of operations for chronic disease risk factor surveys (PDF)
Part III of this document contains protocols and operational guidelines for individual measurements. A European Health Risk Monitoring document; copyright is reserved by the Finnish National Public Health Institute (2002).

Comparison Chart of Guidelines
This document compares selected reference guidelines written by well-known organizations. These guidelines cover all biobanking steps, i.e. biological sample collection, labelling, processing, and storage applied to a wide range of sample types. A P3G-prepared document (2008).

In Practice

Real-life examples related to the collection of physical and cognitive measures.

Please visit the following section:

Database of Genotypes and Phenotypes (dbGaP)
This website lists studies investigating the interaction of genotype and phenotype. Open-access data include a study description, phenotypic variables of interest, study documents (questionnaires, standard operating procedures, consent forms, etc.), and genotype-phenotype analyses. The database can be searched by study, disease, or keyword, and access to datasets can be requested. This website is maintained by the National Center for Biotechnology Information (NCBI) (2006).

Atherosclerosis Risk in Communities (ARIC)
This is the website of the American ARIC study. It displays biochemical procedures such as Blood Collection and Processing, Lipid and Lipoprotein Determinations, Hemostasis Determination and Clinical Chemistry Determinations that were used during recruitment and follow-up. A Collaborative Studies Coordinating Center-maintained website, based at the University of North Carolina.

National Health and Nutrition Examination Survey (NHANES)
This website of the American survey displays all serum/plasma collection and biochemistry procedures, from cholesterol to heavy metals, electrolytes and metabolic enzymes. It also provides, among other things, questionnaire administration methods and individual physical and cognitive measurements. A US Department of Health and Human Services-maintained website (2006).

Analysis

Please visit the following section:

Dissemination

In Theory

NHMRC Biobanks Information Paper (PDF)
The aim of this paper is to provide information relevant for the establishment, management and governance of biobanks in Australia. National and international documentation, as well as other relevant case studies are used to identify best practices with regard to standardization of biobank policies, practices and procedures. Prepared by the NHMRC (2010).

In Practice

Please visit the following section:

Database of Genotypes and Phenotypes (dbGaP)
This website lists studies investigating the interaction of genotype and phenotype. Open-access data include a study description, phenotypic variables of interest, study documents (questionnaires, standard operating procedures, consent forms, etc.), and genotype-phenotype analyses. The database can be searched by study, disease, or keyword, and access to datasets can be requested. This website is maintained by the National Center for Biotechnology Information (NCBI) (2006).

The Human Genome Variation database of Genotype-to-Phenotype information
This website lists and describes genetic association studies. Studies can be searched by phenotypes, genetic markers, or results of interest. Study designs and results can be compared using the browser tool. It is maintained by a team led by Professor Anthony Brookes.

GAPPNET
This organization aims to accelerate and streamline the effective and responsible use of validated and useful genomic knowledge and applications. GAPPNET has four functions: knowledge synthesis and dissemination, evidence-based recommendations development, translational research, and translational programs. Tools such as ACCE, EGAPP and USPSTF have been developed to support them. GAPPNET was formed by CDC's Office of Public Health Genomics, NCI's Division of Cancer Control and Population Sciences, and other stakeholders in 2009.

HuGE Navigator
This website provides access to a continuously updated knowledge base in human genome epidemiology through three main sections. The HuGEpedia, including Phenopedia and Genopedia, is an encyclopedia of human genetic variation in health and disease. HuGEtools allow users to search and mine the literature in human genome epidemiology (HuGE Literature Finder, GWAS Integrator, HuGE Investigator Browser, Gene Prospector, Genotype Prevalence Catalog, HuGE Watch, Variant Name Mapper, HuGE Risk Translator). A series of HuGE-related informatics utilities and projects such as GAPscreener, HuGE Track, and Open Source are available in the HuGEmix section. Website developed and maintained by the Human Genome Epidemiology Network (HuGENet™) (2009).

IT Tools

ORCID
This website aims to solve the author/contributor name ambiguity problem in scholarly communications by creating a central registry of unique identifiers for individual researchers and an open and transparent linking mechanism between ORCID and other current author ID schemes. ORCID is an independent organization.

© 2005 Public Population Project in Genomics.
All rights reserved.