Exercise: Distant Reading with and for AI

This week, we’ll revisit the distant reading we did earlier this semester, this time approaching the procedural side of the project more directly with agentic AI assistance. We’ll again collect, process, and analyze a dataset of texts, but we can now work at a much larger scale by drawing on libraries of existing code. We’ll primarily be making use of a few Python libraries, including Pandas, Matplotlib, and NLTK.

You might find it helpful to look at the documentation for these libraries, or at Python tutorials on web scraping and distant reading, for ideas of things to try. While you can install Python directly on your machine to complete these tasks, my demos will use Google Colab, a free hosted notebook service that runs your code in the browser. Heavier usage requires a paid plan, but you should have no problems completing these tasks at the free usage level.

Working with the Colab Notebook

You’ll be using a pre-built Google Colab notebook and working with Python libraries like Pandas, Matplotlib, and NLTK. Before you start, choose at least five novels for your analysis and download them as .txt files from Project Gutenberg.
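If you want a sense of what the first processing steps look like before opening the notebook, here is a minimal sketch using only the Python standard library. It is not the starter notebook's code. It assumes your downloads are plain-text Project Gutenberg files that mark the body of the book with "*** START OF ..." and "*** END OF ..." lines; the function names and the "novels" folder name are made up for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: Gutenberg plain-text files wrap the book in marker lines
# that begin with "*** START OF" and "*** END OF".
START_RE = re.compile(r"\*\*\*\s*START OF.*?\*\*\*", re.IGNORECASE | re.DOTALL)
END_RE = re.compile(r"\*\*\*\s*END OF.*?\*\*\*", re.IGNORECASE | re.DOTALL)


def strip_gutenberg_boilerplate(text):
    """Return the text between the START and END markers, if present."""
    start = START_RE.search(text)
    begin = start.end() if start else 0
    end = END_RE.search(text, begin)
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()


def tokenize(text):
    """Lowercase the text and keep alphabetic words (with inner apostrophes)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def summarize_novel(text, top_n=10):
    """Compute a few basic distant-reading measures for one novel."""
    tokens = tokenize(strip_gutenberg_boilerplate(text))
    counts = Counter(tokens)
    return {
        "tokens": len(tokens),
        "types": len(counts),
        "type_token_ratio": len(counts) / len(tokens) if tokens else 0.0,
        "top_words": counts.most_common(top_n),
    }


def summarize_folder(folder="novels"):
    """Summarize every .txt file in a (hypothetical) folder of downloads."""
    return {
        path.name: summarize_novel(path.read_text(encoding="utf-8", errors="ignore"))
        for path in sorted(Path(folder).glob("*.txt"))
    }
```

Calling summarize_folder("novels") returns one summary per file, which you could hand to Pandas as a table or ask Gemini to chart. Note that this simple tokenizer does not remove stopwords, so the most frequent words will mostly be function words like "the" and "and".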

Starter Notebook Link: Distant Reading

Run through every cell in the notebook to complete the basic workflow. Once you’ve done that, use Gemini (Google Colab’s AI agent) to extend your analysis with additional analyses and visualizations. Consider approaches like:

These visualizations can focus on one specific novel or text, as the character network in the last cell of the demo does (its quality will vary depending on how your text is structured), or work across all of your texts.
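To see why text structure matters for a character network, here is one common way such a network can be built: count how often two names appear in the same paragraph. This is a sketch under stated assumptions, not necessarily how the demo cell works. It assumes paragraphs are separated by blank lines and that you supply the character names yourself; the function name is made up for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(text, characters):
    """Count how often each pair of character names shares a paragraph.

    Assumption: paragraphs are separated by one or more blank lines.
    Returns a Counter mapping (name_a, name_b) pairs to counts, with the
    two names in each pair sorted alphabetically.
    """
    paragraphs = re.split(r"\n\s*\n", text)
    edges = Counter()
    for paragraph in paragraphs:
        # Whole-word matches only, so "Jane" does not match "Janet".
        present = sorted(
            name
            for name in characters
            if re.search(rf"\b{re.escape(name)}\b", paragraph)
        )
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges
```

If a text file has no blank lines between paragraphs, the whole novel becomes a single "paragraph" and every pair of characters gets a count of one, which is one reason results vary so much between texts. The resulting counts can be used as edge weights when drawing the network, for example with a graph library such as NetworkX.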

Discussion

As you work through the analysis, consider how the relationship with text in distant reading connects to the other ways we’ve been working with AI (both in text and code) throughout the semester. Share your most interesting findings from your distant reading analysis, including your documented visualizations. What did the computational approach reveal about your chosen novels? How did working with an AI agent change your analytical process compared to our earlier distant reading experiments? What are the benefits and limitations of using agentic AI for literary analysis?

Refer back to this week’s readings on algorithms and variables, and connect your experience to broader questions about digital humanities, the relationship between human and machine reading, and how AI tools are changing the perception of “reading.”