Data visualization is the graphical representation of data, which researchers use to identify patterns, trends, outliers, etc., and to create visual evidence in support of a scholarly claim. Visualization chart types vary considerably, e.g., scatterplot, bar chart, and line graph. The kinds you will see in the examples here are timelines, treemaps, and networks. GIS, a form of data visualization that uses geospatial data, will be discussed in the mapping section. (To learn more about data and data visualization, see Introduction to Data.)
The ability to engage with and manipulate 3D models of real-world artifacts and spaces can allow for a deeper understanding of the objects and spaces in question, particularly in comparison to a traditional 2D image. For example, a recent collection of more than 1,700 public domain cultural heritage models from museums and libraries around the world has been made freely available to view and download.
The creation of 3D objects can be divided into two general processes: photogrammetry, which constructs a 3D digital surface (known as a mesh) from a series of overlapping photographs of an object, and laser scanning, which identifies the shape of an object's surface by bouncing lasers off of it.
There are a variety of ways to make your 3D model publicly available, either as part of a research question or in the classroom. The models below are shared using Sketchfab, an online hosting platform.
Here is a 3D model of a pre-hominid skull made for a BC Biology class. (It is made up of approximately 150 images and the mesh is made of 1.5 million triangles.):
Here is a 3D model of the "Hail Flutie" statue set up outside of BC's Alumni Stadium. (It is made up of approximately 75 photos and the mesh is made of 762k triangles.):
3D models can also be 3D printed, such as this print of a Roman statue:
There are many basic visualization types. Which one you use depends upon the kind of data you are displaying, how you want people to engage with it, and the kind of story you are trying to tell.
Some of the most common types of data visualization chart and graph formats include:
Bar charts are one of the most common data visualizations. You can use them to quickly compare data across categories, highlight differences, show trends and outliers, and reveal historical highs and lows at a glance. Bar charts are especially effective when you have data that can be split into multiple categories.
The line chart, or line graph, connects several distinct data points, presenting them as one continuous evolution. Use line charts to view trends in data, usually over time (like stock price changes over five years or website page views for the month). The result is a simple, straightforward way to visualize changes in one value relative to another.
Pie charts are powerful for adding detail to other visualizations. Alone, a pie chart doesn’t give the viewer a way to quickly and accurately compare information. Since the viewer has to create context on their own, key points from your data are missed. Instead of making a pie chart the focus of your dashboard, try using them to drill down on other visualizations.
Scatter plots are an effective way to investigate the relationship between different variables, showing if one variable is a good predictor of another, or if they tend to change independently. A scatter plot presents lots of distinct data points on a single chart. The chart can then be enhanced with analytics like cluster analysis or trend lines.
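To make these chart types concrete, here is a minimal, hypothetical sketch in Python using the matplotlib library (a general-purpose plotting tool, not one specifically endorsed by this guide); the dataset is invented purely for illustration.

```python
# A minimal sketch showing the same small, invented dataset drawn as a bar
# chart, a line chart, and a scatter plot.
import matplotlib.pyplot as plt

years = [2018, 2019, 2020, 2021, 2022]
visits = [120, 150, 90, 200, 240]          # e.g., monthly page views (fictional)

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3))

ax1.bar(years, visits)                      # bar chart: compare categories
ax1.set_title("Bar chart")

ax2.plot(years, visits, marker="o")         # line chart: trend over time
ax2.set_title("Line chart")

ax3.scatter(visits, [v * 0.8 + 10 for v in visits])   # scatter: relationship between two variables
ax3.set_title("Scatter plot")

plt.tight_layout()
plt.show()
```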
DS methods enable scholars and students to conduct research and present scholarship in a variety of modes, including spatial, temporal, textual, hypertextual, immersive, graphical, and exhibitive. DS tools, e.g., ArcGIS, and skills, e.g., coding, are the means by which methods are enacted.
Here you will find an overview of different methods from a conceptual standpoint. To learn more about enacting methods and using tools, visit BCDS Learn.
Geographic Information Systems (GIS) is the combination of geospatial software (e.g., ArcGIS), tools (e.g., a GPS receiver), and geospatial data. While GIS is a form of data visualization, it also falls under the category of mapping. In the examples below, you can see how GIS can be used for visualizing all kinds of data including statistics and geographic areas. GIS is used for creating both static maps, such as the kind one sees in presentations and books, and interactive maps that can be shared online.
This presentation poster created for the 2019 BC Libraries GIS contest has maps created in ArcGIS (see more information about the project in BC's eScholarship).
This interactive map was created in Leaflet using curated spatial data of Gabii, an archaeological site outside of Rome.
Visit the site to interact with this map.
The map's interactive nature allows users to click on individual features to find out more about them, as well as perform actions such as taking measurements, searching by feature number, and turning different years of aerial imagery on and off to see how the excavation evolved.
Mapping Islamophobia is an example of how GIS can visualize geospatial data alongside statistical data.
Visit the site to interact with this map.
Story mapping platforms are applications that use a variety of maps, text, and multimedia elements to present interactive narratives that engage users and provide instantly-accessible geographic context to any project.
This story map, created using Knightlab's StoryMaps, combines text, video, and images to highlight how Chicago's dialogue with classical antiquity has shaped the city's look, reputation, and identity.
Visit the site to interact with this story map.
This story map, which explores the current healthcare system and soaring cancer drug prices, was created using ArcGIS Online Story Maps. A more flexible and complex platform than Knight Lab's version, it combines text, various media elements, and a variety of maps and data visualizations.
Visit the site to interact with this story map.
In digital scholarship, maps are often one component of a larger project and, in some cases, function as an interface to other aspects of a project.
Witches, a University of Edinburgh digital project that visualizes the locations of witch trials from 1550-1750, links a map marked with witch trial locations to The Survey of Scottish Witchcraft database.
Digital scholarship (DS) is the critical use of digital methods and tools in conducting research, presenting scholarship, and teaching. Digital humanities (DH) falls under the umbrella of DS and incorporates humanities specific practices and methodologies.
DS is a way in which faculty, librarians, and students explore new areas of scholarship and engage with traditional scholarship in innovative ways.
The use of 3D and immersive technologies, including 3D scanning, augmented reality (AR), virtual reality (VR), and immersive games, in academia has greatly increased over the past decade as scholars have explored how they can best be used in research and in the classroom. 3D technologies include the variety of ways that three-dimensional digital representations of real-world features can be created and shared. Immersive technologies take this experience a step further, using these digital representations to create a larger digital experience, with its own narrative or argument, that the user can experience firsthand.
Network visualization illustrates connections and relationships between different entities. As is to be expected, the intricacy of a network will impact the complexity of its visualization.
This relatively simple network visualization illustrates the relationships between characters in the film The Lord of the Rings: The Fellowship of the Ring.
Visit the site to interact with this visualization.
"Six Degrees of Francis Bacon," a more complex network visualization of early modern social networks, is a collaborative project to which multiple scholars and students from around the world have contributed.
Visit the site to interact with this visualization.
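For a sense of how such a visualization can be built, here is a minimal, hypothetical sketch in Python using the networkx library (not a tool named in this guide); the characters and relationships are simplified placeholders.

```python
# A minimal, hypothetical network: nodes are characters and edges are
# relationships, loosely in the spirit of the examples above.
import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()
G.add_edges_from([
    ("Frodo", "Sam"),
    ("Frodo", "Gandalf"),
    ("Gandalf", "Aragorn"),
    ("Aragorn", "Legolas"),
    ("Aragorn", "Gimli"),
])

# Draw the graph with labeled nodes; layout positions are computed automatically.
nx.draw_networkx(G, with_labels=True, node_color="lightblue")
plt.axis("off")
plt.show()
```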
Augmented reality (AR) and virtual reality (VR) offer greater user engagement with the subject matter in question than a traditional image does. AR applies a simulated layer that allows users to experience a computer-generated enhancement of their real-world perception. VR involves creating and/or experiencing a computer-simulated world. AR and VR technologies are continuously evolving, with AR, in particular, becoming easier and more affordable to create.
BC Libraries' Digital Studio has been working with the Center for Digital Innovation in Learning (CDIL) and Apple to develop ways to integrate AR into the classroom through Apple's new Reality Composer platform. The examples below will open on any iPhone or iPad. The first (the robot) is a "born-digital" object, while the lion stamp is a real-world object of which a 3D representation was made in the Digital Studio.
An animated robot AR model from the Apple Quick Look gallery projected on to a real-world table:
Lion-headed stamp AR model created using ARKit:
Created by Jessica Linker and her Bryn Mawr College students, this VR reconstruction of an early 1900s biology lab was based on Bryn Mawr archival records. It was created with Unity, a commonly used VR platform.
Visit the site to interact with this project.
Timelines are a temporal form of data visualization and, depending on the tool, allow for varying degrees of detail and complexity.
This basic timeline of Mary Shelley's life was created with TimelineJS, a simple tool that allows for a single linear visualization.
Visit the site to interact with this timeline.
"Conflicts of the World," a more complex timeline than the one above, was created with Tableau Public, a popular tool used for creating multiple types of data visualization.
Visit the site to interact with this timeline.
CSU Japanese American Digitization Project: An Exhibit uses Japanese-American internment documentation and other artifacts to tell a narrative. The linear format is facilitated by Scalar, the platform with which it is published and one that allows for the creation of narrative paths.
Visit the site to explore this exhibit.
The Burns Library's The Object in the Archive provides a more traditional exhibit experience, meaning one that groups objects under unifying themes and topics. Originally physically displayed in the Burns Library, it is also an example of how a physical exhibit can be translated into digital form.
Visit the site to explore this exhibit.
Goin' North uses Omeka, an easy-to-use exhibit and digital collection platform. In addition to incorporating historical visual materials, the exhibit features a number of oral histories.
Visit the site to explore this exhibit.
Like physical exhibits, digital exhibits require research, curation, the writing of captions and other contextual information, information organization, and design. Beyond displaying and contextualizing digitized or born-digital materials like historical documents, photographs, film footage, or audio clips, digital exhibits allow for such things as linking to outside resources and the incorporation of a variety of digital objects like maps and timelines. Digital exhibits often come from existing digital collections or begin with the creation of a digital collection, which requires digitization and metadata generation among other steps.
A hypertext is a digital text that links to sections or pages within the text, to other texts, to media elements, and the like. Websites are the most common hypertext example. EBooks and eJournals are also often hypertextual, though not always with the same level of intricacy as a website. When designing a hypertext there are a number of technical and intellectual considerations including, but not limited to, information architecture, information organization, taxonomies, wayfinding, and navigation. All of these considerations impact usability and user experience. In digital scholarship, hypertexts are ways of publishing scholarship, e.g., making data visualizations and analysis available online. Through their structure, design, and the inclusion of media, they are also used to present scholarly arguments and tell narratives.
Hypertexts can be structured to create narrative paths and games. While a typical website platform like WordPress can be used to create such experiences, tools like Twine and Scalar were designed specifically for such purposes.
A Case of Hysteria, also an example of a digital exhibit, uses Scalar to create paths that take users through different topics.
Visit the site to explore the paths.
The Anachronist, created in Twine, is an example of both a narrative and a game in which players make choices that determine the outcome.
Visit the site to interact with the game.
Mapping is a broad term that describes adding marks, layers, images, etc. to maps. While GIS is part of mapping, not all mapping is GIS. Adding a pin to a Google Earth map, for example, is not GIS. Adding a data layer and using that data layer to perform an analysis is.
The Text Encoding Initiative (TEI) is both a consortium and an XML-based markup standard used for encoding literary texts, historical documents, and the like. Scholars use TEI to "mark up" texts to indicate different aspects such as titles, chapter headings, line breaks, and handwritten marginalia, as well as to note significance and inscribe interpretation. Most commonly, TEI is used to create critical editions and facsimiles.
Related terms:
Markup languages use tags to define elements within a digital document. XML (Extensible Markup Language) is used for encoding texts and employs a number of different standards and schemas.
Markup tags are opening and closing marks that "wrap" a text element, e.g., a title, header, single word, or paragraph. Below is a simple example using the more commonly known markup language, HTML (Hypertext Markup Language):
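A minimal version of that markup, reconstructed from the rendered sentence and the tag explanation below, would look roughly like this:

```html
<h2>Their Eyes Were Watching God</h2>
<p><em>Their Eyes Were Watching God</em> was written by Zora Neale Hurston in 1937.</p>
```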
In a web browser, the above looks like:
Their Eyes Were Watching God was written by Zora Neale Hurston in 1937.
The <h2> tag marks the title as a header and determines the specific size of the header. The <p> tag marks the sentence as a paragraph or a general body of text and the <em> tag marks the title as emphasized (italicized).
The following example is from The Walt Whitman Archive. It shows a portion of the markup written to create a facsimile of the original manuscript for Whitman's poem "The Argument."
As you can see in the TEI facsimile below, the Archive seeks to replicate the edits made in the manuscript as well as the layout of the text:
The TEI markup below looks very different from the facsimile but, if you look closely, you can probably understand some of what the markup is doing.
VR and AR based games, and interactive stories more broadly, are becoming more common and, with the increasing availability of VR and AR technology, are becoming easier to make. (Unity, one of the most popular gaming engines, offers free access to students and individuals for educational use.) The incorporation of games and the creation of them are also becoming more common in scholarly work and in the classroom where they promote multimodal learning patterns and offer a deeper understanding of subject matter.
VR based games involve fully computer-generated worlds. An example is Rome Reborn, which can be viewed on Oculus Rift and HTC Vive.
Closer to home, BC English professor Joseph Nugent and his team created “Joycestick,” an adaptation of James Joyce's Ulysses into an immersive, 3D virtual reality computer game developed in Unity. Users don a VR eyepiece and headphones and, with gaming devices, navigate and explore various scenes from the book.
AR based games involve applying a computer-generated layer onto the real world. The Pokémon GO and Harry Potter: Wizards Unite mobile games are two examples.
The hypertext medium affords scholars the opportunity to publish and present works in many different ways. As an example, the Digital Dante website contains scholarship ranging from commentary and analysis to a Divine Comedy digital edition to multimedia representations and responses to the author's works.
Visit the site to explore Digital Dante.
This treemap visualization, which shows exports around the world, is from the Harvard Growth Lab's The Atlas of Economic Complexity and was created using a platform designed by the Lab.
Text analysis involves using digital tools and one’s own analytical skills to explore texts, be they literary works, historical documents, scientific literature, or tweets. Approaches can be quantitative (e.g., word counting) and qualitative (e.g., topic modeling and sentiment analysis), and tools can range from coding and scripting languages to "out of the box" platforms like Voyant and Lexos.
In the humanities, text analysis is closely associated with the concept of distant reading, which essentially means using computational methods to explore and query large (sometimes massive) corpora. The corpora, or datasets as they are more commonly called in the sciences and social sciences, can be structured or unstructured, and the results can have a data visualization component.
Related Terms:
Text mining (a term used more in the humanities), data mining (a term used more in the sciences and social sciences), and web scraping are techniques that use coding, scripting, and "out of the box" tools to gather text and create a corpus (or dataset).
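As a rough illustration of how a small corpus might be gathered, here is a minimal, hypothetical Python sketch using the requests and BeautifulSoup libraries; the URL and file name are placeholders, and real projects should also respect a site's terms of service and robots.txt.

```python
# A minimal, hypothetical web-scraping sketch: fetch a page, strip the HTML,
# and save the readable text as one document in a growing corpus.
import requests
from bs4 import BeautifulSoup

url = "https://example.org/some-public-text"   # placeholder URL
html = requests.get(url, timeout=30).text

# Remove the HTML tags and keep only the readable text.
text = BeautifulSoup(html, "html.parser").get_text(separator=" ")

with open("corpus_document_001.txt", "w", encoding="utf-8") as f:
    f.write(text)
```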
This project includes TEI-based facsimiles of Newton's alchemical manuscripts, critical materials, an analysis tool, and educational resources.
(Note that you can click on "page image" to see an image of the original manuscript.)
This project is a critical edition, scholarly study, and textual archive of the Old English poem Cædmon's Hymn. It is a good example of the use of TEI in medieval studies, which is heavily focused on manuscripts.
Capturing the world in 360 degrees offers a different kind of immersive experience, one that began with the ability to create panorama shots with a phone and now is a common feature on everything from apartment tours to tourist attractions.
360 degree images, videos, and more complex experiences are becoming easier to create and share. In the academic world, these kinds of images can give the user a better understanding of a space or a more engaging digital experience on a particular subject. In many cases, these kinds of digital products can be combined with other immersive technologies, such as 3D modeling or VR.
360 tours are a common way to utilize these kinds of images, combining a Google Street View-like experience created from 360 images with further information. Recently, special exhibitions at BC's McMullen Museum of Art have been using 3DVista to create 360 versions of their exhibitions, including Indian Ocean Current: Six Artistic Narratives.
360 tours also provide access to spaces people might not be able to travel to or visit. One example is the Anne Frank Family home, which is part of Google Arts and Culture.
Visit the site to explore the Frank Family home.
360 experiences can be combined with photogrammetry/laser scans, like this digital model of the interior of the Scrovegni Chapel (Padua, Italy).
Finally, 360 has entered the world of video, with 360 videos now available for locations such as Petra, Jordan.
Multimedia projects are typically created in a hypertext format. This project, Sound and Documentary in Cardiff and Miller's Pandemonium, was created as a digital companion to a traditional master's thesis. Using Scalar, an open-source publishing platform, the project weaves images, video, and audio to bring greater light and meaning to the research topic.
Visit the site to interact with the project.
In this text analysis example, Ted Underwood and David Bamman used BookNLP, a Java-based natural language processing code, to explore gender in 93,708 English-language fiction volumes. They articulate one of their major discoveries as follows:
There is a clear decline from the nineteenth century (when women generally take up 40% or more of the “character space” in fiction) to the 1950s and 60s, when their prominence hovers around a low of 30%. A correction, beginning in the 1970s, almost restores fiction to its nineteenth-century state. (One way of thinking about this: second-wave feminism was a desperately-needed rescue operation.)
Visit their blog post to learn more about their methods and discoveries.
Here CORD-19, a database containing thousands of scholarly articles about COVID-19 and related coronaviruses, provides a topic model and visualization of 2437 journal articles. The approach used, latent Dirichlet allocation (LDA), is a generative statistical model widely applied in natural language processing.
Visit the site to interact with the visualization.
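For a sense of what an LDA workflow can look like in code, here is a generic, minimal Python sketch using scikit-learn; it is not the CORD-19 team's actual code, and the sample documents are invented.

```python
# A generic, minimal LDA topic-modeling sketch with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "coronavirus transmission in hospital settings",
    "vaccine trials and immune response",
    "economic impact of lockdown policies",
    # ... in practice, thousands of article abstracts
]

# Convert the documents into a word-count matrix, dropping common stopwords.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(documents)

# Fit an LDA model that looks for a chosen number of topics.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

# Print the top words associated with each discovered topic.
words = vectorizer.get_feature_names_out()
for topic_idx, topic in enumerate(lda.components_):
    top_words = [words[i] for i in topic.argsort()[-5:]]
    print(f"Topic {topic_idx}: {', '.join(top_words)}")
```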
Data visualization refers to representing data in a visual context, like a chart or a map, to help people understand the significance of that data. Visualization is a frequent final output of research. Putting some time and strategic thought into data visualization at the beginning of a research project can help you create more effective visualization. (For more on data visualization, see the "Data Visualization" section in DS Methodologies Overview.)
Data visualization is usually one of three types:
Scientific visualization, meaning the representation of scientific phenomena that tend to be tied to real-world objects with spatial properties, e.g., modeling airflow over an airplane.
Information visualization, under which fall most statistical charts and graphs as well as other visual and spatial representations.
Infographics, meaning a specific sort of visualization that combines information visualization with narrative.
In the video below, David McCandless talks about how we can use visualizations to make data more meaningful. He explains how he turns complex data sets (like worldwide military spending, media buzz, and Facebook status updates) into beautiful, simple diagrams that tease out unseen patterns and connections.
Data is a collection of facts, statistics, measurements, and the like that are recorded (or should be recorded) using standardized methods. It is the smallest or rawest form of information and, as such, requires analysis and interpretation. A variety of means are used to collect data, some of which include questionnaire interviews, document analysis, machine measurements, and web scraping.
The terms "data" and "statistics" are often used interchangeably, however, in scholarly research, there is an important distinction between them. Data are individual pieces of factual information recorded and used for the purpose of analysis. It is the raw information from which statistics are created. Statistics are the results of data analysis, meaning its interpretation and presentation.
The following represent questions that would benefit from a data-oriented analysis and data DS methods, e.g., data visualization.
Where in the texts and how often do children speak in Virginia Woolf's novels?
What does Rodolfo Gonzales' correspondence reveal about his political networks?
How closely does the rate of heart disease in adults correlate with economic class, race, gender, and area type (i.e., urban, suburban, or rural)?
How do the rates of African American population increase in Philadelphia and Los Angeles between 1916-1940 correlate with changes in housing laws and redlining practices in both cities?
Digital scholarship data projects usually involve data visualization and/or the creation of databases for the purposes of making data more manageable, navigable, and intelligible. Depending on the tools and methods used, different types of visualizations can be achieved and queries run for asking and answering scholarly questions. (See examples.)
Among other tasks, data projects require planning, the acquisition of existing data or collection of new data, data cleaning and structuring, and, of course, analysis. (Also see Research Data Lifecycle.)
BC Libraries' Data Services facilitates, supports, and consults on data acquisition, management, curation, and visualization, as well as designs and provides data-related in-class instruction and workshops. BC's Research Services also provides support as well as licenses for platforms like ArcGIS.
Text analysis can be done using "out of the box" tools or coding and scripting with the latter approach enabling scholars to explore more nuanced research questions.
Using "out of the box" tools, which don't require coding or scripting, is a good way to get started in text analysis as it will help users begin to understand possibilities and techniques. Voyant and Lexos are examples of such tools. (Mallet, used for topic modeling, is an example of a tool that requires coding but also provides users with a lot of guidance and preexisting code.)
Here is a Voyant instance that contains all of Shakespeare's plays. Stopwords like "thou" and "sir" have been applied to prevent them from dominating the results. (The selection of stopwords is part of the scholarly decision making that goes into text analysis.)
Visit the site to interact with this Voyant instance.
Coding is an umbrella term for using coding (or programming) languages to do things like create applications and websites. Scripting falls under coding and involves using coding languages to do things like automate processes and make websites more dynamic. Coding and scripting are typically done using a computer's command line or platforms like Jupyter Notebooks.
To get a sense of what coding and scripting look like in text analysis, here is a basic example from the Natural Language Toolkit, which uses the Python language. Here you can see a script being run that tags the parts of speech in the sentence, "And now for something completely different." (CC = coordinating conjunction, RB = adverb, IN = preposition, NN = noun, JJ=adjective. )
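The code behind an example like that looks roughly as follows (NLTK's tokenizer and tagger models require a one-time download):

```python
# Tokenize a sentence and tag each word with its part of speech using NLTK.
import nltk

# One-time downloads of the tokenizer and tagger models.
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

text = nltk.word_tokenize("And now for something completely different")
print(nltk.pos_tag(text))
# [('And', 'CC'), ('now', 'RB'), ('for', 'IN'), ('something', 'NN'),
#  ('completely', 'RB'), ('different', 'JJ')]
```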
In this example from Programming Historian, you see a portion of a Python script used for counting word frequencies.
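The core idea of that script, counting how often each word appears, can be sketched in a few lines of Python (a simplified stand-in, not the lesson's exact code):

```python
# A minimal word-frequency sketch in the spirit of the Programming Historian lesson.
from collections import Counter

text = "it was the best of times it was the worst of times"
words = text.lower().split()

frequencies = Counter(words)
for word, count in frequencies.most_common(5):
    print(word, count)
```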
Research data has a "life cycle" that describes and identifies the steps to be taken at the different stages of the research cycle to ensure successful data curation and preservation. The research data lifecycle can be divided into two main parts, Active Research Stage and Post-Active Research Stage.
During the active research stage, research activities mainly include data planning, acquisition, and analysis; during the post-active research stage, the focus is on long-term data preservation, sharing, and re-use.
Planning - The stage in which it is determined how data will be managed. Typical considerations include:
The type and format of data that will be used.
Whether any collected data will involve human subjects.
Where the data will be stored and whether it will be re-used or shared at the end of the project.
Acquire (or "Find") - The stage of when data is found or collected. There are a few steps that can help you develop your approach:
Define your topic as specifically as possible. For example:
What is the average SAT score by race for the last 10 years?
Identify the unit of analysis, meaning what you will specifically be analyzing and by what measure. For example:
Geographic unit, e.g., local, national, international
Frequency, e.g., annual, quarterly, daily
Unit of analysis, e.g., individual, institution
Time series, e.g., cross-sectional, longitudinal (or panel)
Identify data sources. For example:
Government agencies, e.g., census
Organization, e.g., International Monetary Fund (IMF)
Commercial Subscription Services, e.g., Inter-University Consortium for Political and Social Research (ICPSR), Statista
Collaborate and Analyze - The stage of your (and your collaborators') active use of the research data.
What data processing tool(s) are you using? e.g., Excel, Stata, SPSS, Python, R
What kind of data are you working on? e.g., numerical, categorical, text
What kind of data tasks are you performing? e.g., data cleaning, descriptive statistics.
Are you working in a team and is there a designated project manager?
Are you looking for a web-based tool for working on your data?
Store and Preserve - The planning stage for how the data will be archived for long-term preservation. Considerations include:
What archive/repository/database have you identified as a place to deposit data? e.g., Dataverse
How long will data be kept beyond the life of the project?
Share (or Publish) - The stage in which data is shared (or re-used) after a project. Some considerations include:
Through what resources/platforms will the data be made available, e.g., a server or data repository
When the data will be made available, e.g., immediately or after a 12-month embargo
If the dataset was collected by the researchers, how it will be licensed to others, e.g., under a Creative Commons license
Discovery and Re-Use - This stage involves facilitating data sharing, which refers to publicly sharing data from completed research (or completed parts of it), and making the data reusable outside your project or research team.
Whether any permission restrictions need to be placed on the data, e.g., non-commercial use
What are the intended or foreseeable uses of the data and who are the users
The following video explains the data management activities that can take place at different stages of the research process.
What metadata schema will you use? Established domain-specific repositories will usually only accept data that meet their standards for file formats, documentation, and metadata.
Introduction to Data explains fundamental concepts that inform data-related research, the use of data manipulation tools, and data project creation.
The following questions are helpful to consider when beginning a data project.
Are you looking for data & statistics with a time period or geography focus?
Are you looking for a specific data type? e.g., qualitative, quantitative, GIS, multimedia
Are you collecting your own data for your research?
Have you started searching for data sources?
Do you need support on data management (DMP), preservation, or sharing?
What data format are you using? e.g., Excel, Stata, SPSS
What data tasks do you need to conduct?
Data cleaning: the process of preparing data for analysis by removing or modifying data that is incorrect, incomplete, irrelevant, duplicated, or improperly formatted.
Data merging: the process of combining two or more data sets into a single data set.
Aggregation (Summarization): the process of gathering data and presenting it in a summarized format. (A short pandas sketch illustrating these tasks appears after this list of questions.)
What format does your data come in? e.g., Excel, text file, JSON, PDF, spatial
Do you have any preferred visualization tool you want to use?
Do you need help with choosing the visualization tools?
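As noted above, here is a minimal, hypothetical sketch of the cleaning, merging, and aggregation tasks using the pandas library; the tables, columns, and values are invented for illustration.

```python
# A minimal, hypothetical sketch of common data tasks in pandas.
import pandas as pd

surveys = pd.DataFrame({
    "respondent_id": [1, 2, 2, 3],
    "age": [34, None, None, 29],
    "city": ["Boston", "Newton", "Newton", "Boston"],
})
incomes = pd.DataFrame({
    "respondent_id": [1, 2, 3],
    "income": [52000, 61000, 47000],
})

# Data cleaning: drop duplicate rows and rows with missing values.
clean = surveys.drop_duplicates().dropna()

# Data merging: combine the two tables on a shared key column.
merged = clean.merge(incomes, on="respondent_id")

# Aggregation: summarize income by city.
summary = merged.groupby("city")["income"].mean()
print(summary)
```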
The following are some best practices that should be considered prior to starting a data project and provide guidance for managing data in the Research Data Lifecycle's post-active research stage.
To prevent data from being lost to incompatibility, store it in formats and on hardware that are open standard rather than proprietary.
In your documentation, use metadata to record details about the data collection process (e.g., a study) such as:
its context
the dates of data collections
data collection methods, etc.
Sharing data makes it possible for researchers to validate research results and to reuse data for teaching and further research. Sharing is also required by an increasing number of funders and publishers. Funders seek to maximize the impact of the research they fund by encouraging or requiring data sharing.
Depositing to an established repository will help to ensure that data are consistently available, accessible, and preserved for future use. The choice of a data repository can be determined by various factors, such as discipline, accepted data formats, and data sharing policies. You can obtain assistance from Data Services to identify a repository in which to publish your research data.
Attributes are the describing characteristics or properties shared by all items in a certain category; in a table, an attribute applies to all cells of a column.
Data is a collection of facts, statistics, measurements, and the like that are recorded (or should be recorded) using standardized methods.
Data collection is a systematic process of gathering observations or measurements.
Data Visualization is a graphical representation of data.
Metadata is often simply defined as "data about data" or "information about information".
Data points are single units of data or single observations, e.g., a single measurement or a single geolocation point.
A Database is a systematic collection of data.
Dataset (or data set) is a collection of data. Typically, it is structured and housed in a tabular form (e.g., a spreadsheet).
The data life cycle represents all of the stages of data throughout its life from its creation for a study to its distribution and reuse. The data lifecycle begins with a researcher(s) developing a concept for a study; once a study concept is developed, data is then collected for that study.
Data Literacy is the ability to read, understand, create, and communicate data as information.
Geospatial data is defined in international standards as data and information having an implicit or explicit association with a location relative to Earth.
Quantitative data relates to the quantity of something, and typical examples of quantitative data are numbers.
Qualitative data is used to characterize objects or observations and can be collected in non-numerical, non-binary forms, such as language.
Structured data refers to data that resides in a fixed field within a file or record, e.g., a spreadsheet.
Unstructured data refers to a bucket of content or data points that are not organized and categorized, e.g., PDF files, image files.
Type of Data (recommended formats; acceptable formats)
Plain Text - recommended: txt, pdf/A, xml; acceptable: docx, doc, rtf
Tabular Text - recommended: csv, tsv; acceptable: xlsx, xls, sav, dta
Image - recommended: tiff, JPEG 2000; acceptable: jpg, psd, png, gif, bmp
Audio - recommended: wave, aiff; acceptable: mp3, wma, aac, ogg
Archiving - recommended: zip; acceptable: rar
Video - recommended: Motion JPEG 2000, mov, avi; acceptable: mpeg-4
Discipline-specific repositories exist for many subject areas; examples include repositories for Ecology, DNA Sequences, Chemistry, and the Social Sciences.
There are a variety of data visualization tools available, many of them open source, to help you explore existing data visualization or to create your own. Below are a few examples.
Excel is a powerful tool for getting meaning out of vast amounts of data and offers a library of chart and graph types to help users visualize their spreadsheet data.
Tableau is a data visualization and analytics platform that enables users to connect to a variety of data sources and explore the data in a simplified way. The drag and drop interface makes it very easy to visualize and create interactive dashboards without any programming skills. (Browse the Tableau public gallery to see examples of visuals and dashboards.)
Palladio is a web-based data visualization tool for analyzing relationships across time and visualizing historical or cultural networks.
Gephi is free software for visualizing networks. The main website hosts official tutorials and also links to popular community-developed tutorials.
D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. It is ideal for people who want to develop some JavaScript Programming skills and offers great power and flexibility.
While data-oriented scholarship is perhaps more often associated with the sciences and social sciences, it has as much purpose and relevance in the humanities.
Data visualization can be used to illustrate social networks, how information spreads over time and place, historical, literary, and intellectual trends, and much more. The Belfast Group Poetry visualizes literary networks and Geography of the Post visualizes the spread of the US Postal Service in the nineteenth century.
Database creation also makes up a considerable amount of humanities data-related scholarship. Such databases often incorporate primary sources and facilitate the asking and answering of research questions. They Came on Waves of Ink is a database created from a nineteenth-century Puget Sound Customs District ledger, and Enslaved.org, a highly collaborative and grant-funded project, is a database created from slavery-related records provided by different archives and datasets from existing projects like Voyages: The Trans-Atlantic Slave Trade Database.
The following examples of vector and raster data demonstrate some of the many forms the data can take.
Vector data in mapping generally appears as points, lines, or polygons. In this example of polygons as vector data, spatial information tied to the data defines each shape's boundaries, while further information (such as a title and image) is included as additional attributes. Attributes can include any type of additional information that is useful for a viewer to know about a particular location appearing in your dataset.
A basemap is a georeferenced raster image. Basemaps help give a larger context to vector data sets, which would otherwise simply appear as points, lines, or polygons on a blank space. Satellite imagery and other top-down images (orthophotos) are the most common raster data of this kind, for example the Google Maps satellite imagery seen below.
Raster thematic maps are similar to surface maps in that they use attributes of a particular landscape, be they physical or cultural. Thematic maps combine several types of data to create what are called thematic data sets, grouping specified attributes from vector or raster data into categories and then mapping out those categories.
The example below shows vegetation types from a dataset that breaks land-cover types into categories. The data is multispectral, meaning it was acquired by measuring reflected wavelengths in the visible, near-infrared, and short-wave infrared spectrums.
Raster surface maps contain attributes that mark change over a particular landscape instead of representing the visual world as basemaps do. Common attributes indicated on surface maps include elevation (a map of which is specifically termed a digital elevation model, or DEM), rainfall, or temperature. Once these types of surface maps are imported into a GIS platform like ArcGIS or QGIS, they can be used for further analysis and visualization.
The following questions are helpful to consider when beginning a mapping project. They are broken down into different sections based on different potential parts of a mapping project.
Note that some questions are good to consider in multiple contexts and that you do not need to be able to answer them all to get started. If you need assistance, contact BC's Digital Scholarship Group.
What are the goals for my project?
What is your research topic or subject focus?
Generally, where does my mapping project fit into the visualization - analysis - storytelling trifecta?
What is the timeline for my project?
What do I imagine the final product of my project might look like?
How can I obtain the data I need for my project?
Are you collecting your own data for your research or looking for data from a data provider (e.g. census or other governmental data)?
Are you dealing with vector data, raster data, both, or unsure?
Are you gathering spatial data from "pre-digital" sources, such as historical maps, that need to be digitized (the maps themselves and/or the spatial data inherent within the maps)?
Are you looking for spatial data for a certain time period or geographical focus?
Have you started searching for data sources? What are the most relevant ones?
How can I prepare my data to be used in a mapping tool?
What format(s) is your spatial data in (e.g. csv/spreadsheet, shapefile, kml, geojson, etc)?
Do you need to add qualitative or quantitative data to your vector data?
Do you need to clean your data (i.e. make the spatial data internally consistent, fix errors within the attributes of the data, etc)?
Do you need to merge your data (i.e. bring together or combine multiple data sources into one source that contains spatial data, possibly with different attributes or information attached)?
How do I want people to be able to see or engage with my data?
How much data do I have, and is it divided into multiple layers or types?
How do I imagine users engaging with my data (as a pre-created fixed figure, as an interactive map, etc. Also see Sharing below)?
How can my visualization emphasize the argument I am trying to make with the spatial data? What is the main idea I want to convey to my audience?
Is spatial analysis part of my mapping project goals?
What question am I trying to answer with spatial analysis (i.e. what kind of spatial analysis am I trying to do)?
Do I have the necessary data to answer the question I am exploring?
What GIS tool is most appropriate for me to use to answer that question, and do I have access to it (QGIS/GrassGIS/ArcGIS)?
Is storytelling (e.g., a StoryMap, a GeoTour, etc.) a part of my project goals, where the spatial data is integrated into a larger interactive narrative with different kinds of media?
Do you have any preferred StoryMapping tool to use (Knightlab, ArcGIS online, etc.) or do you need help with choosing the visualization tools?
Do you have a general outline of how viewers will move through your story?
What kinds of media do you want to integrate with your story (text, images, audio, video, etc)?
What kind of mapping output is ideal for my project (static figure online or printed, interactive online figure, etc.)?
How will my map be tied to my larger narrative (if desired)?
Where should my spatial data and map project appear/be hosted/preserved? Will it be accessible to the general public?
Will others be able to download my spatial data to work with on their own projects?
Mapping projects can be of all shapes and sizes, from the creation of a traditional figure with an explanatory legend and caption, such as might appear in an academic text, to an online interactive tool that allows for the searching or filtering of thousands of pieces of spatial data or hundreds of historical maps. They can combine vector data representing real-world features (e.g., points, lines, or polygons, which can contain both spatial data and other qualitative or quantitative information) with raster data (e.g., satellite imagery, elevation data, vegetation data, or a historical map).
Visualization, analysis, and storytelling are three of the most common goals for any mapping project, and they may be intertwined. Visualization focuses on the presentation of a collection of data, either through straightforward stylistic choices or through the ability to allow the user to filter datasets in some way. Spatial analysis is a process in which you model problems geographically, derive results by computer processing, and then explore and examine those results. Storytelling then combines the two, allowing the creator to present a cohesive narrative to their reader that is intrinsically tied to the data being presented at a particular moment.
BC Libraries facilitates, supports, and consults on mapping projects, as well as designs and provides mapping-related in-class instruction and workshops. BC's Research Services also provides support as well as licenses for platforms like ArcGIS.
A traditional GIS analysis undertaken using ArcGIS Desktop (BC Library 2021 GIS Contest Winner); Main Goal: Data Analysis, Data Visualization; Platform: ArcGIS
An interactive version of a traditional GIS visualization map, focused on an archaeological site outside of Rome, Italy; Main Goal: Data Visualization; Platform: Leaflet, ArcGIS
The Authorial London project is compiling and mapping references to London places found in the works and biographies of writers who have lived there; Main Goal: Data Visualization (raster through historical maps, vector through location references); Platform: Leaflet/Mapbox (within a larger developed application)
Artists in Paris is the first project to comprehensively map where artistic communities developed in the eighteenth-century city and offers rich scope for subsequent investigations into how these communities worked and the impact they had on art practice in the period; Main Goal: Data Visualization and Filtering; Platform: OpenLayers (within a larger developed application)
Mapping the Gay Guides aims to understand often ignored queer geographies using the Damron Address Books, an early but longstanding travel guide aimed at gay men since the early 1960s. Similar in function to the green books used by African Americans during the Jim Crow era to identify businesses that catered to black clients in the South, the Damron Guides aided a generation of queer people in identifying sites of community, pleasure, and politics. Main Goal: Data Visualization and Filtering; Platform: Leaflet (within a larger developed application)
A clever use of an ArcGIS Online Story Map to discuss the use of spatial analysis in ArcGIS; Main Goal: Storytelling, Data Analysis; Platform: ArcGIS Online
This Interactive Atlas allows you to create and customize county-level maps of heart disease and stroke across the United States; Main Goal: Data Analysis, Data Visualization; Platform: ArcGIS Online
This animated thematic map narrates the spatial history of the greatest slave insurrection in the eighteenth-century British Empire; Main Goal: Storytelling; Platform: Leaflet
An interactive, multimedia storymap detailing Arya's movements across the Game of Thrones books; Main Goal: Storytelling; Platform: Knightlab StoryMaps
Digital exhibits are a form of online exhibit that, like physical exhibits, use objects to tell stories, make arguments, and demonstrate ideas. Attributes include:
Objects can be digitized or “born digital” media of various types (e.g., digitized photographs, rare books, and films or born digital government documents)
Special attention is given to the organization of both the objects and the site in which the exhibit lives.
They might include a digital collection(s) component and interactive elements such as maps and timelines.
Racing to Change - An exhibit that focuses on the Civil Rights Movement in 1960s and 70s Oregon. Platform: A developer customized site
Japanese Digitization Project - An exhibit about Japanese nationals and Japanese American WWII internment. Platform: Scalar
A Gospel of Health: Hilla Sheriff's Crusade Against Malnutrition in South Carolina - An exhibit about Dr. Hilla Sheriff, a pioneering crusader for the public health system in Progressive Era South Carolina. Platform: Wordpress (.com and .org)
Drawn to the Nines - An exhibit about the use of costuming in mid-century illustration. Platform: Omeka (.net and .org)
Quack Cures and Self-Remedies - An exhibit about dubious "cures" and patient care in the mid-19th to early 20th century. Platform: A developer customized site
#LovecraftCountry - An exhibit that examines historical, literary, and cultural events presented in the show Lovecraft Country with published and primary source materials. Platform: Wordpress (.com and .org)
Digital collections are an organized and described (using metadata) group of media objects. Objects can be digitized or “born digital,” media of various types, and are usually searchable and/or browsable. Digital collections can be included in digital exhibit sites--platforms like Omeka lend themselves to this as they allow for the creation of both.
Metadata is information about different types of data, which includes media objects. Some types of metadata are descriptive, which explains characteristics of data/media objects, administrative, which explains things like resource type and copyright, and structural, which explains things like the version of the data/media object and the relationships between different objects within the collection.
Digitization is the act of making a digital version of analog media, for example, scanning a physical photograph or transferring an analog audio tape to a digital audio format.
Born digital refers to media that is digital in its original form, e.g., a Word doc, a photograph taken with a digital camera, and a Photoshop file.
Introduction to Digital Exhibits explains fundamental concepts that inform digital exhibit creation. Much of the information here originated with an introduction to digital exhibits workshop (see slides), which also resulted in the Getting Started with Digital Exhibits tutorial.
Depending on your goals, there are a wide variety of pre-built platforms useful for visualizing, analyzing, and telling stories about spatial data. Below are some common platforms, starting with those with the lowest learning curve and moving to the steepest.
StoryMapJS, a free online tool developed by Northwestern University's Knight Lab, allows creators to combine spatial information with multimedia and textual data to tell location based narratives. Its simple user interface and ability to use both standard and customized underlays make it a good platform for simple story-mapping projects. Its limitations include the fact that it cannot be hosted locally and little customization in terms of styling or functionality is possible. Spatial data points must also be inserted individually either on a map or through long/lat coordinates, making work with large datasets difficult.
Main use: Storytelling, Access: Free, requires a Google account
The story map seen below combines text, video, and images to highlight how Chicago's dialogue with classical antiquity has shaped the city's look, reputation, and identity.
Main Use: Storytelling, Simple Visualization, Access: Free, requires a Google account
With Tableau, you can create the following common map types:
ArcGIS Online is ESRI's free online data visualization tool and can be integrated directly with ArcGIS Desktop or Pro. It is possible to create a free personal account or to join the Boston College account system.
While ArcGIS Online can be useful for making simple interactive maps, where it really shines is in its storymapping functionality. Like Knight Lab StoryMapJS, ArcGIS StoryMaps allow you to create inspiring, immersive stories and tours by combining text, interactive maps, and other multimedia content. ArcGIS Online, however, is a much more flexible platform, allowing creators to present their data in a wider variety of formats and utilize a wider variety of map types. Its learning curve is steeper than Knight Lab's StoryMaps, and it requires the user to have a greater understanding of how they want to organize and share their spatial data.
Main Use: Storytelling, Data Visualization, Access: Free version available, also available to BC faculty, staff, and students
This story map from BC student Wenwei Su won the 2020 Boston College Libraries GIS Contest (digital division) for looking at health care expenditures and mortality rates in the US through the lens of the movie "Dying to Survive." (For more examples, check out past contest winners.)
Google Earth, a free tool available for use online or for download to mobile and desktop, is a common choice for creating simple maps with points, lines, and polygons to share with others. The program maps the Earth by superimposing satellite images, aerial photography, and GIS data onto a 3D globe, allowing users to see cities and landscapes from various angles.
The new projects feature allows you to create a "story map" experience by adding items such as images and videos into location descriptions. It is possible to share your creation in the cloud (through standard Google sharing methods) or download your spatial data as a .kml file to open on a variety of platforms.
The example below uses historical photographs, artwork, Google Street View imagery, and satellite imagery to tell the story of Henry Box Brown, an enslaved person who shipped himself from Virginia to Pennsylvania to obtain his freedom.
Tableau is a data visualization and analytics platform that enables users to connect to a variety of data sources and explore the data in a simplified way. The drag-and-drop interface makes it very easy to visualize data and create interactive dashboards without any programming skills. The mapping features of Tableau Desktop give users the ability to answer spatial questions. Tableau's spatial file connector allows you to easily connect to and join Esri Shapefiles, KML, MapInfo tables, GeoJSON files, and other forms of geospatial data. You can also import geographic data from R or GIS (or whatever format you have) and make it more easily accessible, interactive, and shareable. Census-based population, income, and other standard demographic datasets are built in.
Main Use: Data Visualization, Access: Tableau Public is a free version available to the general public; a free academic version can be acquired by faculty, staff, and students
The example map below shows the foreign-born population of Boston-area communities from 1870 to 2010.
ArcGIS and QGIS are the two most common platforms for organizing your data into a true database and for analyzing data using common spatial methodologies. As such, they are often the go-to for someone wanting to move beyond an Excel or Google Sheets file for recording their datasets. The two platforms are similar, though ArcGIS only runs on Windows computers, while QGIS is a free and open-source GIS platform that runs on both Windows and Macs. While both platforms can be used for sharing visualizations as exports in traditional file formats such as .jpg and .tiff (commonly used in publications), sharing interactive online visualizations requires integration with a secondary platform like ArcGIS Online or Leaflet.
Boston College has an ArcGIS campus license for students, faculty, and staff; see the campus licensing information for how to get a license for your computer. Both platforms are also available for use on campus. If you just want to start with the basics in ArcGIS or QGIS, we recommend running through a few of the beginner tutorials that each platform provides.
Main use: Spatial Organization, Spatial Analysis, Spatial Visualization, Access: ArcGIS licenses are available to BC faculty, staff, and students; QGIS is free and open source to download
Leaflet is an open-source JavaScript library for interactive web maps. It is lightweight and flexible and is probably the most popular open-source mapping library at the moment. Of the web mapping platforms discussed here, it is certainly the most powerful, but it is also the least user friendly, as knowledge of coding in JavaScript is required. Many types of functionality may be achieved more easily using the other platforms described above, yet Leaflet (especially with its many plugins) is by far the most customizable.
Main use: Data Visualization, Access: Free and open source
The example below is a Leaflet-based map from the Institute for Advanced Jesuit Studies (IAJS) Jesuit Online Necrology project, which documents the lives of more than 33,000 members of the Jesuit order. The map shows the many locations mentioned in the necrology, allowing the user to search by location and identify the Jesuits associated with it.
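Leaflet itself is written in JavaScript, but to get a quick feel for the same idea from Python, here is a minimal, hypothetical sketch using folium, a Python library that generates Leaflet maps; the coordinates and labels are placeholders.

```python
# A minimal, hypothetical interactive Leaflet map generated from Python
# with folium. Coordinates and labels are placeholders.
import folium

# Center the map on Boston College's approximate coordinates.
m = folium.Map(location=[42.335, -71.168], zoom_start=15)

# Add a clickable marker with a popup.
folium.Marker(
    location=[42.335, -71.168],
    popup="Example point of interest",
).add_to(m)

# Write the interactive map to an HTML file that can be opened in a browser.
m.save("example_map.html")
```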
Spatial datasets generally come in two distinct forms: vector data and raster (or pixel) data. Raster and vector data can come together in the creation of a wide variety of mapping projects, from a traditional figure with an explanatory legend and caption, such as might appear in an academic text, to an online interactive platform that allows for the searching or filtering of thousands of pieces of spatial data or hundreds of historical maps.
Vector data includes points, lines, or polygons (shapes made up of straight lines) containing spatial information that represent some sort of feature or event in a physical or imagined landscape and may contain other types of qualitative or quantitative information, called attributes. A point may represent a tree, a city, or a moment in time. Lines might indicate the street grid of a town, the path someone traveled across the world, or a social link between two communities. Polygons can mark the boundaries of a country or voting district, the catchment area of a river, or a single city block.
Raster consists of "cells" of data covering a specific area (its extent), with attribute values in each cell representing a particular characteristic. It may still consist of points, lines, and polygons, but these shapes are themselves composed of pixels (the way a jpeg or other image file type is).
Data of this type may take many forms, such as satellite imagery containing vegetation or elevation data, precipitation maps, or even an historical map, which has been given a spatial reference. Unlike vector data, raster data has a particular resolution, meaning each pixel represents a particular geographic region of a specific size.
The concept of data is discussed more broadly in Introduction to Data. Here we take a quick look at spatial data in particular. In short, spatial data adds a geographic dimension to a qualitative and/or quantitative data set, situating it in a particular location within a coordinate system relative to other data points. (The coordinate system can be a real-world system or a locally created one used to meet the needs of a particular project.)
For example, the project Mapping Islamophobia demonstrates how geospatial data can be combined with other data points (e.g., date, gender, and type of incident). Collectively, the data brings to light, from a geospatial perspective, trends in hostility and hate toward Muslim Americans.
The following are questions that would benefit from spatially oriented data analysis and DS mapping methods.
How has the spatial patterning or characteristics of Hmong immigration to Minneapolis–Saint Paul changed over time?
Where in the city has the increased number of health clinics had the greatest impact on reducing diabetes-based illnesses? How does this data correlate with race and economic status?
How do characters in War and Peace move around a physical environment over the course of the book, and how does this connect with larger themes?
How are historical monuments clustered in downtown Boston in comparison to other major cities?
How can we visualize the movement of Jesuit missionaries over the course of the 18th century?
The following are examples of data-related projects that highlight these methods. Note that more complex projects, especially those with custom platforms, are often grant- or institutionally funded, which enables scholars to create more robust public-facing works.
- a data visualization project that uses the tool
- a network visualization project that uses the tool .
- a mapping project that uses the tools , , and
Also see, for more projects
- a visualization project that shows historical immigration to the U.S. (1830-2015)
- a visualization project that displays the fertility rate, life expectancy, and population of countries in six world regions using
- brings together technologists, government, and communities to rapidly prototype digital products, powered by federal open data, that solve real-world problems for communities across the country.
- utilizes Facebook data to provide insights on topics including social connections, relative wealth, COVID-19 impact, and climate change
- an international, collaborative research program whose goal was the complete mapping and understanding of all the genes of human beings
- a scientific collaboration of international physics institutes and research groups dedicated to the search for gravitational waves
- a BC science project that shows how much the ground moves in Weston, Massachusetts
- a science project for showcasing astronomical data and knowledge
Data can be in three different forms: unstructured, semi-structured, and structured.
Unstructured data is, essentially, a bucket of content or data points that are not organized and categorized. A folder full of images and digitized texts is a form of unstructured data. (In both cases, steps can be taken to structure them, however.)
Text files, such as Word documents, PDFs, and TXT files
Multimedia content, such as image files (TIFF, JPEG) and audio/video files (MP3, MP4)
Qualitative data, such as survey responses and interview transcripts
Semi-structured data lies midway between structured and unstructured data. It doesn't have a specific relational or tabular data model, but it includes tags and semantic markers that organize the data into records and fields within a dataset. Common examples of semi-structured data formats are JSON and XML.
The following is an example of semi-structured data using JSON. The data describes an author's work.
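(The record below is an invented illustration; the tags, such as "title" and "year", mark out the individual fields.)

```json
{
  "author": {
    "name": "Jane Austen",
    "born": 1775,
    "works": [
      { "title": "Sense and Sensibility", "year": 1811, "genre": "novel" },
      { "title": "Pride and Prejudice", "year": 1813, "genre": "novel" }
    ]
  }
}
```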
Structured Data is data that is organized and categorized so that it can be more effectively analyzed, in particular by tools like databases and data visualization applications.
Understanding a little about structured data provides a lot of insight into how data works in various data tools. Data is structured in a tabular form (spreadsheets) or tables created using coding and markup languages. For the sake of simplicity, we will look at structured data through the lens of tabular data.
Tabular data, what we think of as spreadsheets, is structured data organized in rows and columns. Each row represents a record (or unit of analysis) and each column represents a different attribute (also referred to as a variable or field).
An attribute describes everything that falls within it or, in this case, underneath it. Think of it like tagging. Everything in a column is tagged by the attribute. Each horizontal line is a row, and a single row makes up what is called a record, meaning a series of data points that go together.
To put tabular data or a spreadsheet into a more relatable context, here is an imaginary DMV database spreadsheet.
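(The attributes and values below are invented purely for illustration.)

| License Number | Last Name | First Name | Date of Birth | City |
| --- | --- | --- | --- | --- |
| S123-4567 | Smith | Maria | 1990-04-12 | Boston |
| S891-0111 | Smith | Maria | 1985-11-03 | Newton |
| T456-7890 | Lee | Daniel | 1978-07-29 | Boston |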
Notice that each data point falls under the appropriate attribute and each row represents a single driver's license (a record). Also notice that none of the driver's license numbers repeat. These are unique identifiers that help distinguish records from one another when other information is the same or very similar. Moreover, the unique identifier is a data point by which the record can be searched.
As shown in the example, structured data is highly organized and easily read by machines. Those working within relational databases can input, search, and manipulate structured data relatively quickly using a relational database management system (RDBMS).
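As a rough sketch of what this looks like in practice (using the invented records above and Python's built-in sqlite3 module rather than any particular DMV system), structured records can be stored and queried like this:

```python
import sqlite3

# Create an in-memory database with columns that mirror the spreadsheet's attributes
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE licenses (
           license_no TEXT PRIMARY KEY,  -- the unique identifier for each record
           last_name  TEXT,
           first_name TEXT,
           birth_date TEXT
       )"""
)

# Each tuple is one record (one row of the spreadsheet)
conn.executemany(
    "INSERT INTO licenses VALUES (?, ?, ?, ?)",
    [
        ("S123-4567", "Smith", "Maria", "1990-04-12"),
        ("S891-0111", "Smith", "Maria", "1985-11-03"),  # same name, different person
    ],
)

# The unique identifier lets us retrieve exactly one record, even when names repeat
row = conn.execute(
    "SELECT * FROM licenses WHERE license_no = ?", ("S123-4567",)
).fetchone()
print(row)
```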
Before starting the exhibit creation process, you need to closely consider your topic, desired effects, and objectives with the understanding that these decisions might change as you progress.
1.) Determine the Topic
What is the main focus or theme of the exhibit?
Some examples: a historical period or movement, an event, a person's biography, a process or technique (e.g., silk screen printing), an idea or concept (e.g., the law of gravity), an industry (e.g., whaling), a single object (e.g., a specific book, painting, or musical instrument)
2.) Determine the Desired Effect
Effects to consider (as cited in Barth et al., 2018):
Aesthetic: organized around the beauty of objects
Emotive: designed to elicit an emotion in the viewer
Evocative: designed to create a specific atmosphere
Didactic: designed to teach the viewer about a specific topic
Entertaining: designed for the amusement or enjoyment of the viewer
3.) Determine the Objectives
What do you want people "walking away" with? This means considering things like:
What is the motivation for creating the exhibit? (Why this exhibit?)
What are the intended learning outcomes?
How do you want visitors to be able to apply what they learn beyond the exhibit?
4.) Create the Exhibit
Once the above considerations have been made, it is time to begin the exhibit creation process. See our "Getting Started with Digital Exhibits" tutorial for more information.
References
Barth, G. L., Drake Davis, L., Mita, A. (2018). "Digital Exhibitions: Concepts and Practices". Mid-Atlantic Regional Archives Conference Technical Leaflet Series no. 12.
While raster data can be stored in many different formats, GeoTIFF is one of the most common. A GeoTIFF is a standard .tif image that also contains spatial information (e.g., coordinates and projections) and attributes. In general, however, any standard image (e.g., a JPEG) can act like a raster file if it is given proper spatial data, a practice called georeferencing.
Vector data includes data in the form of points, lines, and polygons along with any connected attributes. This type of data can come in several different formats, such as GeoJSON, CSV, and KML. Below you can see an array of data formats (found in Boston Open Data): GeoJSON, CSV, KML, Shapefile, ESRI REST API, and ArcGIS Hub Dataset. All contain the same food truck information:
While most platforms allow for some interoperability (the ability to work with several file formats), it is helpful to understand the basic distinction between them.
A CSV (comma-separated values) file is one of the most well-known formats for both spatial and non-spatial data. It is created by using commas to separate each value of an attribute, with each line of the file equal to a single data record. If we return to thinking about structured data sets, a comma-separated values file simply puts a comma between attribute values, returning an output that looks something like this:
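(The food-truck column names and values below are invented purely for illustration; the first line names the attributes.)

```
Truck_Name,Day,Latitude,Longitude
Lunch Box,Monday,42.3601,-71.0589
Taco Stand,Tuesday,42.3499,-71.0784
```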
CSV data is useful simply because it is easy to create and manipulate in software like Excel or Google Sheets, and these programs often allow you to export your data as a CSV to make it more interoperable. It is also easy to transform a CSV file into other formats, like GeoJSON, making it an easy "go-to" for sharing spatial data.
CSV files are especially useful if you simply have a collection of point data. A file of this kind, sometimes called "XY data" or "long/lat data," can be easily imported into software like ArcGIS or QGIS in order to visualize it. (Note that a file containing only coordinates will not carry any other attributes.)
GeoJSON is an open standard format for spatial data based on the JSON file format, which includes spatial information (points, lines, polygons) along with other qualitative and quantitative attributes. If you are using Python, a coding language, or a platform like Leaflet to manipulate spatial data, a GeoJSON export is probably your best choice.
The example below is a selection of GeoJSON data. It is organized similarly to the well-known CSV, with each feature containing a series of attributes, including (in this case) point geometry with the longitude and latitude values for where the food truck will be located.
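(The snippet below is an illustrative stand-in using the same invented food-truck values as the CSV example above, not the actual Boston Open Data record; note that GeoJSON lists coordinates in longitude, latitude order.)

```json
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": { "Truck_Name": "Lunch Box", "Day": "Monday" },
      "geometry": { "type": "Point", "coordinates": [-71.0589, 42.3601] }
    }
  ]
}
```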
It's worth noting that there are tools for converting CSV to GeoJSON and vice versa.
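As a minimal sketch of what such a conversion involves (using only Python's standard library and the invented column and file names from the examples above), a CSV of point data can be turned into GeoJSON like this:

```python
import csv
import json

def csv_to_geojson(csv_path, lon_field="Longitude", lat_field="Latitude"):
    """Convert a CSV of point data into a GeoJSON FeatureCollection."""
    features = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            lon = float(row.pop(lon_field))
            lat = float(row.pop(lat_field))
            features.append({
                "type": "Feature",
                "properties": row,  # the remaining columns become attributes
                "geometry": {"type": "Point", "coordinates": [lon, lat]},
            })
    return {"type": "FeatureCollection", "features": features}

# Example: convert the food-truck CSV shown earlier and write the result to disk
with open("food_trucks.geojson", "w") as out:
    json.dump(csv_to_geojson("food_trucks.csv"), out, indent=2)
```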
A KML (Keyhole Markup Language) file is an open standard spatial data format most commonly associated with displaying geographic data in an Earth browser such as Google Earth. Although not always as easily compatible with other platforms such as ArcGIS or Leaflet, a variety of conversion tools and plugins built into these platforms may help ease the process.
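For comparison with the formats above, a minimal (invented) KML placemark for the same food-truck point might look like the following; KML stores coordinates as longitude,latitude,altitude:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>Lunch Box food truck</name>
    <Point>
      <coordinates>-71.0589,42.3601,0</coordinates>
    </Point>
  </Placemark>
</kml>
```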
A Shapefile is a simple format for storing the geometric location and attribute information of geographic features represented by points, lines, or polygons (areas), and is most commonly associated with ESRI's ArcGIS program suite. From within ArcGIS/QGIS, you may then export it as other file types, though this sometimes requires purchasing the Interoperability extension.
Introduction to Mapping discusses fundamental concepts that inform spatial data research, the use of GIS tools, and mapping project creation. While the focus in this section will be on maps based on spatial data (i.e., GIS), it is important to note that simple maps can be created in tools like Knightlab StoryMaps, Google MyMaps, and even ArcGIS Online without sophisticated data. In that case, marks are made on maps using things like "pins" and "notes" that can be added either by searching coordinates or by manually placing them on a designated image.
Integrating mapping methods into a DS project can offer a variety of advantages in terms of visualization, storytelling, and analysis. Mapping can make complex information and arguments more digestible to the average reader, combining a multitude of information into a single comprehensible figure. It can also help an author tell a story with their data, guiding the viewer around a landscape as the story unfolds with tools such as Knightlab StoryMaps. Finally, it opens the door to more specialized spatial analysis through the use of programs like ArcGIS or QGIS, taking advantage of modern computing to reinforce an academic argument to an audience.
There are two types of data: quantitative and qualitative. Generally speaking, when you measure something and give it a number value, you create quantitative data. When you classify or judge something, you create qualitative data. There are also different types of quantitative and qualitative data. (Also see the Qualitative vs Quantitative Data article.)
Qualitative data is used to characterize objects or observations and can be collected in non-numerical, non-binary forms, such as language. Qualitative data can include:
Text
Audio and video recordings
Experiment notes, lab reports
Interview transcripts
Two types of qualitative data include categorical, meaning data that can be organized in groups, and ordinal, meaning qualitative data that follows a natural order.
Quantitative data, as the name suggests, relates to the quantity of something, and typical examples of quantitative data are numbers. Quantitative data can include:
Survey data, including longitudinal and cross-sectional studies
Counts and frequencies
Calculations, such as monthly gross margin
Quantifications, such as converting descriptive data to numbers (e.g., a satisfaction rating from 1 to 4)
Two types of quantitative data include continuous, meaning numbers that can be made more precise and divided (e.g., a magnitude 4.3 earthquake), and discrete, meaning numbers that cannot be divided (e.g., the number of people in a household cannot include a fraction such as 3.5).
| Categorical | Ordinal |
| --- | --- |
| States (e.g., New York, Massachusetts, Arizona) | Economic class (e.g., lower class, middle class, higher class) |
| People's names (e.g., Matt, Emily, Maria) | Satisfaction scale (e.g., extremely dislike, dislike, neutral, like, extremely like) |
| Brands (e.g., Coke, Pepsi, Dr. Pepper) | Sports medals (e.g., gold, silver, bronze) |
Platform selection will influence decisions about site organization and design, among other things. You may or may not start out knowing what platform you want to use, or you may change your mind about the chosen platform based on the way you want to organize your exhibit, the way you want visitors to interact with it, and whether you want to have the object available in digital collection form as well.
Some considerations when choosing a platform include:
Ease of use and customizability
Whether there needs to be a searchable collections component
Whether exhibits should be navigable linearly, non-linearly, or both
The types and level of interactivity desired
Accessibility for users with disabilities
Some common platform options:
Omeka (.net and .org) [accessibility statement]
WordPress (.com and .org) [accessibility statement]
For more information, see Comparing Systems for Creating Digital Exhibits by Dr. Pamella Lach