📝 This article is a translation of the original Japanese post. View original

Software Design and Graph Theory

This article was created with LLM assistance

When discussing software design, terms like layers, responsibilities, coupling, state transitions, and dependencies come up frequently. Many of these are easier to organize when expressed as graphs.

Graph theory lets you do more than just draw clean diagrams—it gives you mechanical support for detecting circular dependencies, estimating the scope of change, discovering unreachable states, and determining execution order.

This article explores what you gain by viewing software design through the lens of graphs, and walks through a minimal example of dependency verification using diagrams and pseudocode.

Conclusions

Module dependencies, call relationships, state transitions, and data flows can all be treated as graphs
In design reviews, what matters is defining clearly what vertices and edges mean, not just drawing pretty diagrams
For detecting circular dependencies, strongly connected components and topological sort are practically effective
For estimating change impact, reachability is useful; for bottleneck analysis, centrality and cut concepts can apply
However, in implementations heavy on reflection, dynamic imports, or configuration-driven branching, don’t rely solely on static graphs to assert the full picture

Prerequisites

This article assumes no specific language or runtime environment. Code snippets are pseudocode to convey concepts and can be adapted to any language or tool.

Target audience: Those who want to know how to apply graph theory to practical software design reviews and architecture organization
Subject matter: Module dependencies, state transitions, data processing order
Goal: Grasp the design perspective, not implementation techniques

Viewing Software Design as a Graph

A graph is an abstract representation of vertices and edges. Applied to design, vertices represent modules, classes, services, states, jobs, etc., and edges represent relationships like dependencies, transitions, communication, references, or data flow.

For example:

Design Subject	Vertices	Edges	Common Analysis
Module dependencies	Packages, libraries, services	import, reference, usage	DAG, strongly connected components, topological sort
Call relationships	Functions, methods, endpoints	call, invoke	Reachability, dominance, impact scope
State transitions	States	Transition conditions	Unreachable states, cycles, terminal states, automata
Data pipelines	Jobs, topics, tables	I/O, dependencies	Execution order, recomputation scope, bottlenecks
System architecture	Services, nodes, subnets	Communication paths	Cuts, path length, single points of failure

In practice, many problems become visible once you draw them as a graph. For example: “a module that should be in a lower layer is referencing the upper UI layer” or “the state transition diagram has states with no way back.”

A Small Dependency Graph Example

Writing dependency relationships as a directed graph makes design assumptions much clearer.

flowchart LR ui[UI] --> app[Application] app --> domain[Domain] app --> auth[Auth] auth --> infra[Infrastructure] domain --> infra

What matters here is not the appearance but “what does the direction of an edge mean?” Here A --> B is defined as “A depends on B.” If you leave the direction ambiguous, both analysis results and discussions become unstable.

Representative Scenarios Where Graph Theory Helps

1. Finding Circular Dependencies

In a layered architecture you want to keep as a DAG, the presence or absence of cycles is a good approximation of design health.

You want to determine build order
You want to detect layer violations
You want to check if package responsibilities have broken down

In this case, check whether topological sort can produce an order, and whether computing strongly connected components reveals clusters of multiple vertices.

About Topological Sort

Topological sort is a way to arrange vertices without violating dependency relationships. If A depends on B, then B appears before A in the sorted order. In other words, it mechanically arranges “what needs to be ready before what.” That way of thinking makes it clear.

In practice, it applies directly to module build order, batch job execution order, table creation order, and screen initialization order. The key point is that topological sort only works when there are no cycles. If there’s a cycle like app -> auth -> infra -> app, you get a state where “no matter which you start with, something else needs to come first,” making it impossible to determine a unique order.

For example, consider dividing an internal system into 4 layers: UI -> Application -> Domain -> Infrastructure. UI uses Application, Application uses Domain, and Domain uses Infrastructure. If the topological sort result comes out as Infrastructure, Domain, Application, UI, it aligns with the design intent of “build the lower layers first, then stack the upper layers on top.”

Conversely, if on some day Infrastructure starts referencing Application, a cycle forms between the two. This change—even if it’s just one new import—is a sign that “a lower layer has come to know about a higher layer’s concerns.” Topological sort makes it easy to surface that breakdown early, as a build error or CI failure.

2. Estimating Change Impact Scope

When a module changes, which components it affects can be organized by following edges in reverse. By reversing the dependency graph and enumerating vertices reachable from the changed node, you can narrow down review targets and test candidates.

However, in implementations with strong dependencies on configuration files, plugins, or dynamic loading, static analysis alone may miss parts of the picture.

3. Finding State Machine Defects

In workflow and job control, viewing state transition diagrams as graphs makes it easier to find these defects:

Unreachable states
Transitions with no way back on failure
Unending cycles
Mixed transitions that should be mutually exclusive

Don’t just draw a state diagram in a design doc and call it done—cross-check which states can transition where against code and tests to reduce operational incidents.

4. Considering Bottlenecks and Single Points of Failure

In distributed systems, visualizing as a graph whether all communication concentrates on one service or queue makes it easy to find single points of failure and over-connected hubs. Even without rigorous mathematical optimization, just looking at in-degree/out-degree imbalances is often useful.

Minimal Example with Diagrams and Pseudocode

Here we represent module dependencies as a simple directed graph and verify the presence of circular dependencies and execution order. Since it’s language-independent, you can just read along with the diagrams and pseudocode.

1. Verify Order with a Valid Dependency Graph

First, drawing the dependencies in Mermaid looks like this:

flowchart LR ui[UI] --> app[Application] app --> domain[Domain] app --> auth[Auth] auth --> infra[Infrastructure] domain --> infra

Running topological sort on this graph in pseudocode looks like:

dependencies = {
  UI: [Application],
  Application: [Domain, Auth],
  Auth: [Infrastructure],
  Domain: [Infrastructure],
  Infrastructure: []
}

order = topological_sort(dependencies)
print(order)

The result will be something like Infrastructure, Auth, Domain, Application, UI. The order of Auth and Domain may swap, but both must come after Infrastructure and before Application. This result aligns with the design intent of “build lower layers first, then stack upper layers on top.”

2. Introduce a Cycle and Confirm Failure

Next, add a Infrastructure -> Application dependency to deliberately create a cycle.

flowchart LR ui[UI] --> app[Application] app --> domain[Domain] app --> auth[Auth] auth --> infra[Infrastructure] domain --> infra infra --> app

In pseudocode, just add one dependency:

dependencies = {
  UI: [Application],
  Application: [Domain, Auth],
  Auth: [Infrastructure],
  Domain: [Infrastructure],
  Infrastructure: [Application]
}

order = topological_sort(dependencies)
if order cannot be built:
  print("cycle detected")

In this case, because there is a cycle Application -> Auth -> Infrastructure -> Application, the order cannot be fully determined. In practice, putting this kind of check in CI makes it easy to surface design breakdowns before review.

3. View Change Impact with a Reversed Graph

When a change is made to a module, you can also trace how far the impact scope should extend with a diagram. Tracing Infrastructure in reverse, impact candidates expand as follows:

flowchart RL infra[Infrastructure] --> auth[Auth] infra --> domain[Domain] auth --> app[Application] domain --> app app --> ui[UI]

The thinking in pseudocode:

reverse_graph = reverse_edges(dependencies)
affected = reachable_nodes(from = Infrastructure, graph = reverse_graph)
print(affected)

The result will be something like Infrastructure, Auth, Domain, Application, UI. This indicates that a change to Infrastructure may potentially require checking the entire upper-layer stack. Of course, actual impact depends on public API compatibility and runtime execution paths, but it’s sufficient as an initial estimate in a design review.

Practical Perspective When Applying to Real Work

When bringing graph theory into design, it’s more practical to think in the following order rather than memorizing difficult algorithms:

Decide what the vertices are
Define what each edge means, one type at a time
Decide what properties you expect
Check for patterns that violate those properties

For example, “I want to enforce a layered architecture” means the expected properties are “no circular dependencies” and “no references from lower layers to upper layers.” What you need here is not a massive diagram, but dependency extraction rules and violation detection.

Conversely, “I want to know the cause of processing delay” may require a weighted graph or queuing model. At that point, a plain dependency graph isn’t enough. In other words, graphing is not universal—you need to choose the model to match the property you want to know.

Caveats

Static dependency graphs may miss paths through reflection, dynamic imports, or configuration-driven routing
Mixing multiple meanings in a single edge makes analysis results ambiguous
Visualizing a massive graph as-is is hard for humans to read; it’s better to start with a coarse graph aggregated at the responsibility level
Don’t equate proximity on a graph with high change risk; also look separately at public API stability and operational importance
Mathematical correctness and team design judgment don’t always align; it’s safer to use analysis results as material for conversation

Troubleshooting

If topological sort fails, trace not just part of the cycle but “why that dependency was needed” all the way back
If the visualized diagram is too complex, increase granularity from package level to bounded context level
If change impact is too broad, check whether too many responsibilities have concentrated in shared infrastructure
If static analysis results differ from runtime behavior, treat configuration files, plugins, and generated code paths as a separate graph
If putting it in CI generates too much noise, first stabilize the inspection rules by targeting only specific directories

References

Dijkstra, E. W. “A note on two problems in connexion with graphs”. Numerische Mathematik, 1959
Kahn, A. B. “Topological sorting of large networks”. Communications of the ACM, 1962
Tarjan, R. E. “Depth-first search and linear graph algorithms”. SIAM Journal on Computing, 1972
R. C. Martin’s discussions on dependency relationships and layering are easier to organize when viewed as graphs. However, during implementation, also check language spec and build tool constraints

Software Design and Graph Theory

Conclusions

Prerequisites

Viewing Software Design as a Graph

A Small Dependency Graph Example

Representative Scenarios Where Graph Theory Helps

1. Finding Circular Dependencies

About Topological Sort

2. Estimating Change Impact Scope

3. Finding State Machine Defects

4. Considering Bottlenecks and Single Points of Failure

Minimal Example with Diagrams and Pseudocode

1. Verify Order with a Valid Dependency Graph

2. Introduce a Cycle and Confirm Failure

3. View Change Impact with a Reversed Graph

Practical Perspective When Applying to Real Work

Caveats

Troubleshooting

References

Share with

See Also