Module 3: Analyzing text content with natural language processing


Lesson 3.3: Interpreting the results of NLP analysis

AI-aided content analysis of sustainability communication

nils.holmberg@iko.lu.se

Quantitative content analysis

  • Systematically evaluates text features like theme frequency and sentiment.
  • Text-level features (e.g., tone) differ from word-level features (e.g., keyword counts).
  • Human coders offer flexibility but lack scalability for large datasets.
  • AI-aided coding sacrifices nuance for rapid, scalable analysis.
  • Complements qualitative analysis, validating insights with numerical rigor.

📰

Operationalizing sustainability

  • Distinguishes authentic sustainability from greenwashing via metrics.
  • Authentic communication uses specific, measurable environmental metrics.
  • Greenwashing features vague terms and unverifiable claims.
  • Named entity recognition identifies the entities mentioned (e.g., organizations, places), a first step toward mapping their relationships.
  • Part-of-speech analysis reveals intent through nouns, verbs, adjectives.
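
One way to turn the contrast between measurable claims and vague wording into numbers is sketched below. It is only an illustration: the word list `VAGUE_TERMS`, the pattern `METRIC_PATTERN`, the helper `claim_profile`, and both example sentences are assumptions made up for this sketch, not a validated greenwashing lexicon or real company text.

```python
import re

# Hypothetical list of vague "green" terms (illustrative, not a validated lexicon).
VAGUE_TERMS = {"green", "eco-friendly", "sustainable", "natural", "clean", "responsible"}

# Rough pattern for a measurable claim: a number followed by a unit or percent sign.
METRIC_PATTERN = re.compile(
    r"\b\d+(?:[.,]\d+)?\s*(?:%|percent|tonnes?|kg|MWh|GWh|TWh)(?!\w)",
    re.IGNORECASE,
)

def claim_profile(text):
    """Count vague terms and measurable metrics in a text."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    n_vague = sum(1 for word in words if word in VAGUE_TERMS)
    n_metrics = len(METRIC_PATTERN.findall(text))
    return {"vague_terms": n_vague, "metrics": n_metrics}

# Made-up example sentences, not quotes from any organization.
specific_text = "We cut emissions by 35 % since 2019 and produced 120 GWh of wind power."
vague_text = "Our green, eco-friendly products support a sustainable and clean future."

print(claim_profile(specific_text))
print(claim_profile(vague_text))
```

A high count of vague terms combined with few metrics is at most a signal worth closer qualitative reading, not proof of greenwashing.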

♻️

Comparison across organizations

  • Compares Preem (fossil fuel) and Vattenfall (renewable energy).
  • Fossil fuel firms emphasize mitigation; renewables highlight innovation.
  • NLP detects differences in word choice and narrative tone.
  • Public scrutiny shapes fossil fuel firms’ defensive messaging.
  • Quantifies alignment with sustainability goals across sectors.

⚖️

Summarizing results of text analysis

  • Simplifies complex token-level dataframes for better readability.
  • Aggregates metrics like content category counts or sentiment scores.
  • Analyzes dependent variables (e.g., category frequency) against independent variables (e.g., organization type).
  • Highlights trends, e.g., adjective use by organization type.
  • Ensures findings are actionable for diverse stakeholders.
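
The idea of summarizing a dependent variable against an independent one can be sketched with pandas as follows. The token-level table is made up for illustration, and the column names `organization` and `pos` are assumptions about how such a dataframe might be laid out.

```python
import pandas as pd

# Hypothetical token-level data: one row per token. Values are made up.
tokens = pd.DataFrame({
    "organization": ["Preem"] * 5 + ["Vattenfall"] * 5,
    "pos": ["NOUN", "ADJ", "VERB", "NOUN", "ADJ",
            "NOUN", "VERB", "NOUN", "ADJ", "VERB"],
})

# Independent variable: organization. Dependent variable: part-of-speech frequency.
counts = pd.crosstab(tokens["organization"], tokens["pos"])

# Row-wise shares make organizations with different text lengths comparable.
shares = counts.div(counts.sum(axis=1), axis=0).round(2)

print(counts)
print(shares)
```

Reporting shares rather than raw counts is a design choice: it prevents the organization with more text from looking as if it uses more of every word type.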

📈

Select, filter, aggregate

  • Selects key columns like token entity or part-of-speech tags.
  • Filters out noise, e.g., null values or low-frequency tokens.
  • Aggregates data to compute category counts or sentiment means.
  • Enables precise comparisons, e.g., sustainability terms by organization.
  • Transforms raw data into structured, research-ready insights.
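
The three steps above can be sketched in pandas roughly as follows. The data, the column names, and the threshold `MIN_COUNT` are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical token-level data with a missing row. Values are made up.
tokens = pd.DataFrame({
    "organization": ["Preem", "Preem", "Preem", "Preem",
                     "Vattenfall", "Vattenfall", "Vattenfall", "Vattenfall"],
    "token": ["emissions", "renewable", "emissions", None,
              "wind", "renewable", "wind", "hydrogen"],
    "pos": ["NOUN", "ADJ", "NOUN", "PUNCT", "NOUN", "ADJ", "NOUN", "NOUN"],
    "sentiment": [-0.2, 0.5, -0.4, None, 0.6, 0.4, 0.8, 0.1],
})

# Select: keep only the columns needed for the comparison.
selected = tokens[["organization", "token", "sentiment"]]

# Filter: drop rows with null values, then drop low-frequency tokens.
filtered = selected.dropna(subset=["token", "sentiment"])
MIN_COUNT = 2  # assumed threshold: keep tokens seen at least twice in the corpus
token_counts = filtered["token"].value_counts()
frequent = filtered[filtered["token"].map(token_counts) >= MIN_COUNT]

# Aggregate: token count and mean sentiment per organization.
summary = frequent.groupby("organization").agg(
    n_tokens=("token", "size"),
    mean_sentiment=("sentiment", "mean"),
)

print(summary)
```

Keeping each step in its own variable makes it easy to inspect what was removed at the filtering stage before trusting the aggregated numbers.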

🎯

Visualizing results of text analysis

  • Visualizations make NLP results more intuitive than tables.
  • Options include bar plots, word clouds, and heatmaps.
  • Simple visuals (e.g., bar plots) are clearer than complex ones.
  • Plotting libraries like Matplotlib streamline visualization processes.
  • Highlights trends, e.g., term frequency differences across firms.

🎨

Stacked bar plots

  • Simple bar plots show single-organization metrics like category frequency.
  • Stacked bar plots compare multiple variables across organizations.
  • Segments represent variables (e.g., word types) within bars.
  • Reveals differences, e.g., adjective use in Preem vs. Vattenfall.
  • Supports multivariate analysis for clear, comparative insights.
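
A stacked bar plot of this kind could be drawn with Matplotlib along the following lines. The counts in `pos_counts` are invented for illustration and do not come from any actual analysis of Preem or Vattenfall; the output file name is likewise an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt

organizations = ["Preem", "Vattenfall"]

# Hypothetical part-of-speech counts per organization (made-up numbers).
pos_counts = {
    "NOUN": [120, 140],
    "VERB": [60, 75],
    "ADJ": [45, 30],
}

fig, ax = plt.subplots()
bottoms = [0] * len(organizations)

# Draw one segment per word type, stacking each on top of the previous ones.
for pos, values in pos_counts.items():
    ax.bar(organizations, values, bottom=bottoms, label=pos)
    bottoms = [b + v for b, v in zip(bottoms, values)]

ax.set_ylabel("Token count")
ax.set_title("Word types by organization (illustrative data)")
ax.legend(title="Part of speech")
fig.savefig("stacked_bar.png")
```

Each bar is one organization and each colored segment is one word type, so both the total and the composition can be compared at a glance.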

📊