Skip to content

Question 3 example 2

⬅️ [Question 4 example 1](<./Question 4 example 1.md>) | ⬆️ [Generated Test Questions](<./README.md>) | [Question 3 example 1](<./Question 3 example 1.md>) ➡️

Problem 3: Conceptual Application - Images as Graphs and Segmentation (25 pts)

You are tasked with designing an interactive tool that automatically separates a user-selected object (foreground) from the rest of the image (background). You decide to use the Max-Flow Min-Cut formulation to solve this two-class segmentation problem.

Based on the concepts of representing images as graphs, answer the following questions:

Part A (8 pts): The energy function used for this segmentation is formulated as $E(x) = E_1(x) + E_2(x)$, which consists of a unary energy term ($E_1$) and a pairwise energy term ($E_2$). Define the conceptual role of both terms. Furthermore, explain the "myopia problem" that occurs if we attempt to segment the image by only minimizing the unary term, and describe how the pairwise term acts as a regularizer to overcome this issue.

Part B (9 pts): Describe the setup of the directed graph used to minimize this energy function. Specifically:
1. What do the standard nodes and the two special nodes (Source and Sink) represent?
2. How are the directed edges structured between the Source, Sink, and pixel nodes, and what determine their capacities?

Part C (8 pts): The Ford-Fulkerson algorithm iteratively finds augmenting paths in a residual network until no more paths exist, at which point the maximum flow is found. According to the Max-Flow Min-Cut theorem, explain conceptually how you extract the final foreground and background segmentation from this residual network once the algorithm finishes.


Answer Key / Grading Rubric:

Part A (8 pts):
Unary Term ($E_1$): Evaluates pixels independently to determine how well they fit the foreground or background features. * Pairwise Term ($E_2$): Enforces spatial continuity and models a Markov Random Field by penalizing neighboring pixels that are assigned different labels.
The Myopia Problem: Minimizing only the unary term is computationally easy but ignores neighboring pixel contexts, leading to noisy and "nearsighted" segmentation results.
Regularization:* The pairwise term fixes the myopia problem by paying a penalty when neighboring pixels differ, with the penalty being higher when the pixels have similar intensities, forcing the model to reconsider placing boundaries between similar pixels.

Part B (9 pts):
Nodes: Every pixel in the image acts as a standard node. The Source node ($S$) represents the foreground, and the Sink node ($T$) represents the background.
Source-to-Pixel Edges: Directed edges connect the Source to the pixels, with capacities based on the unary foreground costs.
Pixel-to-Sink Edges: Directed edges connect the pixels to the Sink, with capacities based on the unary background costs.
Pixel-to-Pixel Edges: Directed edges go in both directions between neighboring pixels, with capacities based on the agreement or similarity (affinity) between them.

Part C (8 pts):
Once the max flow is found, there are no more augmenting paths in the residual network, and the graph is partitioned (the min-cut).
Foreground Extraction: The set of nodes that are still reachable from the Source node in the residual network becomes the foreground segment.
Background Extraction:* The nodes that are unreachable from the Source belong to the background segment.


⬅️ [Question 4 example 1](<./Question 4 example 1.md>) | ⬆️ [Generated Test Questions](<./README.md>) | [Question 3 example 1](<./Question 3 example 1.md>) ➡️