Research Data Lifecycle
Research data has a "life cycle" that identifies the steps to take at each stage of the research cycle to ensure successful data curation and preservation. The research data lifecycle can be divided into two main parts: the Active Research Stage and the Post-Active Research Stage.
During the active research stage, research activities mainly include data planning, acquisition, and analysis, while during the post-active research stage the focus is on long-term data preservation, sharing, and re-use (see also Data Management Best Practices).
Planning - The stage in which it is determined how data will be managed. Typical considerations include:
The type and format of data that will be used.
Whether any collected data will involve human subjects.
Where the data will be stored and whether it will be re-used or shared at the end of the project.
Acquire (or "Find") - The stage in which data is found or collected. A few steps can help you develop your approach:
Define your topic as specifically as possible. For example:
What is the average SAT score by race for the last 10 years?
Identify the unit of analysis, meaning what you will specifically be analyzing and by what measure. For example:
Geographic unit, e.g., local, national, international
Frequency, e.g., annual, quarterly, daily
Unit of analysis, e.g., individual, institution
Time dimension, e.g., cross-sectional, longitudinal (or panel)
Identify data sources. For example:
Government agencies, e.g., census
Organization, e.g., International Monetary Fund (IMF)
Commercial Subscription Services, e.g., Inter-University Consortium for Political and Social Research (ICPSR), Statista
Collaborate and Analyze - The stage in which you (and your collaborators) actively use the research data. Questions to consider include:
What data processing tool(s) are you using? e.g., Excel, Stata, SPSS, Python, R
What kind of data are you working on? e.g., numerical, categorical, text
What kind of data tasks are you performing? e.g., data cleaning, descriptive statistics.
Are you working in a team and is there a designated project manager?
Are you looking for a web-based tool for working on your data?
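As an illustration of the data tasks mentioned above, here is a minimal sketch of data cleaning followed by descriptive statistics, using only the Python standard library. The values in `raw_scores` and the helper names `clean_scores` and `describe` are hypothetical and exist only for this example; a real project would likely read its data from a file and might use a tool such as R, Stata, or a Python data library instead.

```python
from statistics import mean, median

# Hypothetical raw records; None and "" stand in for missing values.
raw_scores = ["1200", "1350", "", None, "1100", " 1450 "]


def clean_scores(values):
    """Drop missing entries and convert the remaining strings to integers."""
    cleaned = []
    for value in values:
        if value is None or not value.strip():
            continue  # skip missing or blank entries
        cleaned.append(int(value.strip()))
    return cleaned


def describe(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


scores = clean_scores(raw_scores)
summary = describe(scores)
print(summary)
```

Keeping the cleaning step separate from the summary step makes it easier to document what was done to the data, which helps collaborators reproduce the analysis.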
Store and Preserve - The stage in which you plan how the data will be archived for long-term preservation. Considerations include:
What archive/repository/database have you identified as a place to deposit data? e.g., Dataverse
How long will data be kept beyond the life of the project?
What metadata schema will you use? e.g., Dublin Core. Established domain-specific repositories will usually only accept data that meet their standards for file formats, documentation, and metadata.
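To make the idea of a metadata schema more concrete, the following sketch builds a small descriptive record using element names from the Dublin Core element set (`title`, `creator`, `date`, `format`, `rights`). The field values, the `build_dc_record` helper, and the `metadata` wrapper element are assumptions made for illustration; each repository defines its own required fields and submission format, so check its documentation before depositing.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML element holding one Dublin Core child per field.

    The "metadata" wrapper element is a hypothetical container chosen
    for this example, not a format required by any repository.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical values for an example dataset.
record = build_dc_record(
    {
        "title": "Example survey dataset",
        "creator": "Example Research Team",
        "date": "2024-01-15",
        "format": "text/csv",
        "rights": "CC BY 4.0",
    }
)

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```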
Share (or Publish) - The stage in which data is shared (or re-used) after a project. Some considerations include:
Through what resources or platforms the data will be made available, e.g., a server or data repository
When the data will be made available, e.g., immediately or after a 12-month embargo
If the dataset was collected by the researchers, how it will be licensed to others, e.g., under a Creative Commons license
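The timing consideration above can be worked out with simple date arithmetic. This is a minimal sketch, assuming the embargo is counted in whole calendar months from the deposit date; the function names `add_months` and `release_date` and the example dates are hypothetical, and a funder or repository may count embargo periods differently.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    If the target month is shorter than the start day (for example,
    starting on the 31st), the day is clamped to the month's last day.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def release_date(deposit_date, embargo_months=0):
    """Date on which the data becomes available (no embargo by default)."""
    return add_months(deposit_date, embargo_months)


# Hypothetical deposit date with a 12-month embargo.
print(release_date(date(2024, 3, 15), embargo_months=12))
```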
Discovery and Re-Use - The stage that facilitates data sharing, that is, publicly sharing data from completed research (or completed parts of it) so that the data can be reused outside your project or research team. Considerations include:
Whether any permission restrictions need to be placed on the data, e.g., non-commercial use
What the intended or foreseeable uses of the data are, and who the users will be
The following video explains the data management activities that can take place at different stages of the research process.