MIT EECS

6.S954 Computer Vision and Planetary Health

Spring 2025


Course Overview

Description: Introduces the growing interdisciplinary intersection of computer vision and planetary health, with a focus on open challenges in CV, and AI more broadly, that limit the deployability of automated approaches to global environmental problems. Topics include representation learning for imbalanced, fine-grained, and open-set categories; distributional robustness and adaptation; efficiency in training, evaluation, and inference; human-AI collaboration via, e.g., active learning, selective prediction, or active inference; and heterogeneously sampled multimodal learning. Lecture material covers fundamentals and state-of-the-art (SOTA) methods from recent papers. Includes in-class discussion and participation, presentation of papers, and a group final project.

Prerequisites: 6.8300 or 6.7960, or permission of instructor




Course Information

Instructor Sara Beery

beery at mit dot edu

OH: Mon 2-3pm 45-741H

TF Justin Kay

kayj at mit dot edu

OH: Weds 1:30-2:30pm 45-6 Lounge

- Logistics

- Course structure

This is the first time we are running this course, so it is experimental: the exact structure and timing are subject to change.

- Grading Policy

  • 60% class presentations
    • 10% per role/paper (each student will present six times, twice in each role)
  • 20% class participation
  • 20% final project
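As a rough illustration of how the weights above combine, the following Python sketch computes a weighted final grade. The function name `final_grade` and the 0-100 score scale are assumptions made for this example; it is not an official grading formula.

```python
# Weights come from the grading policy above. The function name and the
# 0-100 score scale are assumptions made for this illustration.
PRESENTATION_WEIGHT = 0.10   # per presentation; six presentations = 60%
PARTICIPATION_WEIGHT = 0.20
PROJECT_WEIGHT = 0.20


def final_grade(presentation_scores, participation, project):
    """Combine component scores (each on a 0-100 scale) into one grade."""
    if len(presentation_scores) != 6:
        raise ValueError("expected six presentation scores (two per role)")
    presentations = sum(PRESENTATION_WEIGHT * s for s in presentation_scores)
    return (
        presentations
        + PARTICIPATION_WEIGHT * participation
        + PROJECT_WEIGHT * project
    )
```

For example, `final_grade([90, 80, 100, 70, 85, 95], participation=90, project=80)` weights each of the six presentations at 10% and the remaining two components at 20% each.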



    Class Schedule


    ** Class schedule is subject to change **

    Date Topics Speakers Presented Papers Additional Reading (Optional)
    Week 1
    Tue 02/04 Course overview, introduction to CV in ecology/environment
    [Slides]
    Sara Beery
    Justin Kay
    N/A; Syllabus
    Thu 02/06 History/current use of CV in ecology/environment (cont.)
    Student Presentations
    - Biodiversity and AI: Opportunities and recommendations for action (GPAI report)
    - Perspectives in machine learning for wildlife conservation
    - Tackling Climate Change with Machine Learning
    Week 2
    Tue 02/11 Overview of planetary crises: climate change, biodiversity loss, goals (30 by 30)
    [Slides]
    Sara Beery
    Student Presentations
    - The Future of Biodiversity
    - Seven Shortfalls that Beset Knowledge of Biodiversity
    - The Functions of Biological Diversity in an Age of Extinction
    - The Social Costs of Keystone Species Collapse: Evidence from the Decline of Vultures in India
    Thu 02/13 Planetary crises (cont.), needed progress
    Student Presentations
    - Getting the measure of biodiversity
    - Darwin Core: An Evolving Community-Developed Biodiversity Data Standard
    - Essential biodiversity variables for mapping species populations
    - Taking stock of nature: Essential biodiversity variables explained
    - Anthropogenic climate and land-use change drive short- and long-term biodiversity shifts across taxa
    Week 3
    Tue 02/18 No class - Monday schedule
    Thu 02/20 Imbalanced, long-tailed, and fine-grained learning
    [Slides]
    Justin Kay
    Student Presentations
    - The iNaturalist Species Classification and Detection Dataset
    - Building a Bird Recognition App and Large Scale Dataset With Citizen Scientists: The Fine Print in Fine-Grained Dataset Collection
    - Caltech-UCSD Birds 200
    - LVIS: A Dataset for Large Vocabulary Instance Segmentation
    Week 4
    Tue 02/25 Imbalanced, long-tailed, and fine-grained learning (cont.)
    Student Presentations
    - Class-Balanced Loss Based on Effective Number of Samples
    - Fill-Up: Balancing Long-Tailed Data with Generative Models
    - Balanced Contrastive Learning for Long-Tailed Visual Recognition
    - Focal Loss for Dense Object Detection
    - Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere
    Thu 02/27 Imbalanced, long-tailed, and fine-grained learning (cont.)
    Student Presentations
    - FaceNet: A Unified Embedding for Face Recognition and Clustering
    - WildlifeDatasets: An open-source toolkit for animal re-identification
    - ArcFace: Additive Angular Margin Loss for Deep Face Recognition
    - Long-tailed Recognition by Routing Diverse Distribution-Aware Experts
    - Detecting Mammals in UAV Images: Best Practices to address a substantially Imbalanced Dataset with Deep Learning
    Week 5
    Tue 03/04 Open-set learning
    [Slides]
    Sara Beery
    Student Presentations
    - Generalized Out-of-Distribution Detection: A Survey
    - From Coarse to Fine-Grained Open-Set Recognition
    Thu 03/06 Open-set learning (cont.)
    Student Presentations
    - Zero-Shot Learning—A Comprehensive Evaluation of the Good, the Bad and the Ugly
    - Labeled Data Selection for Category Discovery
    - Three types of incremental learning
    - Large-Scale Long-Tailed Recognition in an Open World
    Week 6
    Tue 03/11 Distribution shift and distributional robustness
    [Slides]
    Sara Beery
    Student Presentations
    - Taxonomic bias in biodiversity data and societal preferences
    - WILDS: A Benchmark of in-the-Wild Distribution Shifts
    - Recognition in Terra Incognita
    - The Caltech Fish Counting Dataset: A Benchmark for Multiple-Object Tracking and Counting
    Thu 03/13 Distribution shift and robustness (cont.)
    Student Presentations
    - mixup: Beyond Empirical Risk Minimization
    - Spatial Implicit Neural Representations for Global-Scale Species Mapping
    - AutoFT: Learning an Objective for Robust Fine-Tuning
    - TIML: Task-Informed Meta-Learning for Agriculture
    - Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
    Week 7
    Tue 03/18 Domain adaptation and specialization
    [Slides]
    Sara Beery
    Student Presentations
    - Domain-Adversarial Training of Neural Networks
    - AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation
    Thu 03/20 Domain adaptation and specialization (cont.)
    Student Presentations
    - Mean Teachers Are Better Role Models
    - Align and Distill: Unifying and Improving Domain Adaptive Object Detection
    - Global birdsong embeddings enable superior transfer learning for bioacoustic classification
    - A theory of learning from different domains
    Week 8
    Tue 03/25 No class - Spring break
    Thu 03/27 No class - Spring break
    Week 9
    Tue 04/01 Efficiency in training, evaluation, deployment
    Sara Beery
    Student Presentations
    - A Comprehensive Survey on TinyML
    - Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
    Thu 04/03 Efficiency (cont.)
    Student Presentations
    - Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression
    - Distilling the Knowledge in a Neural Network
    - A survey on federated learning: challenges and applications
    Week 10
    Tue 04/08 Human-AI systems - Active learning and selective prediction
    Sara Beery
    Student Presentations
    - Deep Bayesian Active Learning with Image Data
    - Selective Classification for Deep Neural Networks
    - 21 000 birds in 4.5 h: efficient large-scale seabird detection with machine learning
    - Fast building segmentation from satellite imagery and few local labels
    Thu 04/10 Human-AI - Active/selective (cont.)
    Student Presentations
    - A deep active learning system for species identification and counting in camera trap images
    - Active Learning-Based Species Range Estimation
    - Role of Human-AI Interaction in Selective Prediction
    - Investigating Selective Prediction Approaches Across Several Tasks in IID, OOD, and Adversarial Settings
    - Iterative human and automated identification of wildlife images
    - Human-Machine Collaboration for Fast Land Cover Mapping
    Week 11
    Tue 04/15 Human-AI systems - Active inference and decision support
    Sara Beery
    Student Presentations
    - Prediction-Powered Inference
    - DISCOUNT: Counting in Large Image Collections with Detector-Based Importance Sampling
    - IS-COUNT: Large-scale Object Counting from Satellite Images with Covariate-based Importance Sampling
    Thu 04/17 Human-AI - Inference/decision (cont.)
    Student Presentations
    - Active Testing: Sample–Efficient Model Evaluation
    - A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification
    - Deep reinforcement learning for conservation decisions
    - Human-in-the-Loop Visual Re-ID for Population Size Estimation
    Week 12
    Tue 04/22 Multimodality - X and language
    Sara Beery
    Student Presentations
    - INQUIRE: A Natural World Text-to-Image Retrieval Benchmark
    - Large language models possess some ecological knowledge, but how much?
    - WildCLIP: Scene and animal attribute retrieval from camera trap data with domain-adapted vision-language models
    Thu 04/24 Multimodality - X and language (cont.)
    Student Presentations
    - BioCLIP: A Vision Foundation Model for the Tree of Life
    - CLAP: Learning Audio Concepts From Natural Language Supervision
    - TaxaBind: A Unified Embedding Space for Ecological Applications
    Week 13
    Tue 04/29 Multimodality - Remote sensing and ground observation, knowledge-guided learning, ontologies, scientific AI agents
    Sara Beery
    Student Presentations
    - Mission Critical: Satellite Data is a Distinct Modality in Machine Learning
    - Combining Deep Learning and Street View Imagery to Map Smallholder Crop Types
    - The Auto Arborist Dataset: A Large-Scale Benchmark for Multiview Urban Forest Monitoring Under Domain Shift
    - Integrating remote sensing with ecology and evolution to advance biodiversity conservation
    - Priority list of biodiversity metrics to observe from space
    Thu 05/01 Multimodality smorgasbord (cont.)
    Student Presentations
    - Contrasting local and global modeling with machine learning and satellite data: A case study estimating tree canopy height in African savannas
    - Knowledge-guided Machine Learning: Current Trends and Future Prospects
    - Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution
    - SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery
    - Harnessing machine learning to guide phylogenetic-tree search algorithms
    - Graph embedding and transfer learning can help predict potential species interaction networks despite data limitations
    - Understanding Ecological Systems Using Knowledge Graphs: An Application to Highly Pathogenic Avian Influenza
    Week 14
    Tue 05/06 Guest lecture TBD
    Thu 05/08 Guest lecture TBD
    Week 15
    Tue 05/13 Final project presentations

    Final project assignment details

    The final project in this course is to propose a new CV-based research project for an ecological application. We are very open to broad interpretations of both “CV-based” and “ecological application”. You can work in groups of up to three. We would like to see proposals that are clearly motivated from the ecological side and clearly grounded in the existing AI literature. We take the stance that novelty can take many forms, and are excited to see your proposed versions of “Application-driven innovation”.

    Proposals should be 6-8 pages long and include the following sections:

    1. Introduction
    2. Related work
    3. Datasets
    4. Method and Experiment Design
    5. Broader Impacts

    Proposals are due on Gradescope by end of day May 9th.