What is Data?

Data is a collection of facts, statistics, measurements, and similar observations that are recorded, ideally using standardized methods. It is the rawest form of information and, as such, requires analysis and interpretation. Data can be collected in many ways, including questionnaires, interviews, document analysis, machine measurements, and web scraping.

The terms "data" and "statistics" are often used interchangeably; in scholarly research, however, there is an important distinction between them. Data are individual pieces of factual information recorded and used for the purpose of analysis. They are the raw information from which statistics are created. Statistics are the results of data analysis, that is, the interpretation and presentation of the data.
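The distinction can be made concrete with a minimal sketch. The heart-rate values below are invented purely for illustration, and the `summarize` helper is a hypothetical name, not part of any particular research workflow. The list holds the data (individual recorded observations); the dictionary returned by the function holds statistics derived from them.

```python
from statistics import mean, median

# Hypothetical raw data: resting heart rates (beats per minute) recorded
# for six study participants. These values are invented for illustration.
heart_rates = [62, 71, 68, 75, 80, 64]


def summarize(values):
    """Turn raw observations (data) into summary figures (statistics)."""
    return {
        "count": len(values),                # how many observations were recorded
        "mean": mean(values),                # arithmetic average
        "median": median(values),            # middle value of the sorted data
        "range": max(values) - min(values),  # spread between the extremes
    }


summary = summarize(heart_rates)
print(summary)
```

Each number in `heart_rates` is a single recorded fact; the mean, median, and range exist only after the data have been analyzed.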

What Do Data Research Questions Look Like?

The following questions would benefit from data-oriented analysis and DS methods such as data visualization.

  • Where in the texts and how often do children speak in Virginia Woolf's novels?

  • What does Rodolfo Gonzales' correspondence reveal about his political networks?

  • How closely does the rate of heart disease in adults correlate with economic class, race, gender, and area type (i.e., urban, suburban, or rural)?

  • How do the rates of African American population increase in Philadelphia and Los Angeles between 1916 and 1940 correlate with changes in housing laws and redlining practices in both cities?

