
Structured & Unstructured Data

Data can be in three different forms: unstructured, semi-structured, and structured.

Unstructured Data

Unstructured data is, essentially, a bucket of content or data points that have not been organized or categorized. A folder full of images or a collection of digitized texts is a form of unstructured data. (In both cases, however, steps can be taken to structure them.)

Examples of unstructured data:

  • Text files: Word documents, PDFs, TXT files

  • Multimedia content: image files such as TIFF and JPEG; audio/video files such as MP3 and MP4

  • Qualitative data: survey responses, interview transcripts
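Unstructured content can be given structure by describing each item with the same set of named fields. The following is a minimal Python sketch of that idea, not a prescribed workflow. The file names and the extension-to-category mapping are hypothetical and chosen only for illustration.

```python
from pathlib import PurePath

# Hypothetical file names standing in for an unorganized folder.
filenames = ["letter_1892.pdf", "portrait.tiff", "interview_03.mp3", "notes.txt"]

# Assumed mapping from file extension to a broad category (illustrative only).
CATEGORIES = {
    ".pdf": "text", ".txt": "text", ".docx": "text",
    ".tiff": "image", ".jpeg": "image",
    ".mp3": "audio", ".mp4": "video",
}

def describe(filename):
    """Turn one file name into a record with named fields."""
    path = PurePath(filename)
    extension = path.suffix.lower()
    return {
        "file": path.name,
        "extension": extension,
        "category": CATEGORIES.get(extension, "unknown"),
    }

# One record per file: the same fields for every item.
records = [describe(name) for name in filenames]
for record in records:
    print(record)
```

Each file is still the same file, but the list of records can now be sorted, filtered, or counted by category.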

Semi-structured Data

Semi-structured data lies midway between structured and unstructured data. It does not follow a specific relational or tabular data model, but it includes tags and semantic markers that separate data into records and fields within a dataset. Common formats for semi-structured data are JSON and XML.

The following is an example of semi-structured data in JSON. The data describes an author.

{
    "name": {
        "surname": "Lee",
        "given-name": "Julia",
        "viaf_id": 49595329
    },
    "role": "author",
    "degrees": "Ph.D.",
    "affiliation": {
        "class": "academic institution",
        "institution": "Acadimia as Colonialism"
    }
}
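To show how the tags in a JSON record work as field names, here is a short Python sketch that parses the record above with the standard `json` module and reads a few values. The variable names are illustrative assumptions; the data is copied from the example as given.

```python
import json

# The author record from the example, held in a string for parsing.
raw = """
{
    "name": {"surname": "Lee", "given-name": "Julia", "viaf_id": 49595329},
    "role": "author",
    "degrees": "Ph.D.",
    "affiliation": {
        "class": "academic institution",
        "institution": "Acadimia as Colonialism"
    }
}
"""

record = json.loads(raw)  # JSON objects become Python dictionaries

# Keys act as the tags; nested objects are reached one key at a time.
full_name = f'{record["name"]["given-name"]} {record["name"]["surname"]}'
print(full_name)
print(record["affiliation"]["class"])

# The top-level tags of the record.
top_level_fields = sorted(record.keys())
print(top_level_fields)
```

Nothing forces every record in a JSON file to carry the same tags or the same nesting, which is what keeps the format semi-structured rather than tabular.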

Structured Data

Structured data is data that is organized and categorized so that it can be analyzed more effectively, in particular by tools such as databases and data visualization applications.

Understanding a little about structured data provides a lot of insight into how data works in various data tools. Data can be structured in tabular form (as in a spreadsheet) or in tables created with coding and markup languages. For the sake of simplicity, we will look at structured data through the lens of tabular data.

Structured Data Example

Tabular data, what we think of as spreadsheets, is structured data organized in rows and columns. Each row represents a record (or unit of analysis) and each column represents a different attribute (also referred to as a variable or field).

An attribute describes everything that falls within it or, in this case, underneath it. Think of it like tagging. Everything in a column is tagged by the attribute. Each horizontal line is a row, and a single row makes up what is called a record, meaning a series of data points that go together.

To put tabular data or a spreadsheet into a more relatable context, here is an imaginary DMV database spreadsheet.

Notice that each data point falls under the appropriate attribute and each row represents a single driver's license (a record). Also notice that none of the driver's license numbers repeat. These are unique identifiers that help distinguish records from one another when their other information is the same or very similar. Moreover, the unique identifier is a data point by which the record can be searched.
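The same ideas can be sketched in Python with the standard `csv` module. The table below is a hypothetical stand-in for a DMV spreadsheet: every column name, person, and license number is invented for illustration and does not come from a real dataset.

```python
import csv
import io

# Hypothetical DMV data; every name and license number is invented.
raw = """license_number,last_name,first_name,state,expires
D1000001,Rivera,Ana,CA,2027-05-01
D1000002,Chen,Wei,CA,2026-11-15
D1000003,Rivera,Ana,CA,2028-02-20
"""

# Each row becomes one record; the header row supplies the attribute names.
records = list(csv.DictReader(io.StringIO(raw)))

# A column is every value that falls under one attribute.
last_names = [r["last_name"] for r in records]

# The identifier must never repeat, even when other values do.
ids = [r["license_number"] for r in records]
assert len(ids) == len(set(ids)), "license numbers must be unique"

# Index the records by identifier so one record can be looked up directly.
by_id = {r["license_number"]: r for r in records}
print(by_id["D1000003"]["expires"])
```

Two of the invented drivers share a first and last name, so only the license number tells their records apart.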

As shown in the example, structured data is highly organized and easily processed by machines. Those working within relational databases can input, search, and manipulate structured data relatively quickly using a relational database management system (RDBMS).
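As a rough illustration of inputting, searching, and manipulating structured data, the sketch below uses Python's built-in `sqlite3` module as a small stand-in for an RDBMS. The `licenses` table, its columns, and its rows are hypothetical.

```python
import sqlite3

# In-memory database; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE licenses (
        license_number TEXT PRIMARY KEY,  -- unique identifier
        last_name      TEXT NOT NULL,
        state          TEXT NOT NULL
    )
    """
)

# Input: insert two records.
conn.executemany(
    "INSERT INTO licenses VALUES (?, ?, ?)",
    [("D1000001", "Rivera", "CA"), ("D1000002", "Chen", "CA")],
)

# Search: look up a single record by its identifier.
row = conn.execute(
    "SELECT last_name FROM licenses WHERE license_number = ?", ("D1000002",)
).fetchone()
print(row[0])

# Manipulate: update one attribute of one record.
conn.execute(
    "UPDATE licenses SET state = ? WHERE license_number = ?", ("OR", "D1000001")
)
states = conn.execute(
    "SELECT license_number, state FROM licenses ORDER BY license_number"
).fetchall()
print(states)

# The primary key rejects a repeated identifier.
try:
    conn.execute("INSERT INTO licenses VALUES (?, ?, ?)", ("D1000001", "Lee", "WA"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
conn.close()
```

Declaring the license number as the primary key lets the database itself enforce that identifiers never repeat.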


Last updated 4 years ago