Tutorial: Hardware Mapping
The talon.graph library maps TALON IR graphs onto multi-core neuromorphic hardware. This tutorial covers defining hardware constraints, partitioning graphs across cores, allocating resources, routing inter-core communication, and placing cores on a physical mesh.
What You'll Learn
- Define hardware constraints with HardwareSpec
- Partition a graph across cores
- Allocate resources (SRAM, neurons, synapses)
- Route inter-core communication
- Place cores on a physical mesh
- Use device presets (Zynq-7020, ZCU102)
Prerequisites
pip install t1c-talon
1. HardwareSpec
HardwareSpec defines the constraints of the target hardware — core capacity, memory, fan-in/fan-out limits, and mesh topology.
import numpy as np
from talon import ir, graph as talongraph
spec = talongraph.HardwareSpec(
max_neurons_per_core=1024,
max_synapses_per_core=200000,
sram_bytes_per_core=524288,
num_cores=4,
max_feedback_delay=3,
)
print("HardwareSpec:")
print(f" max_neurons_per_core: {spec.max_neurons_per_core}")
print(f" max_synapses_per_core: {spec.max_synapses_per_core:,}")
print(f" sram_bytes_per_core: {spec.sram_bytes_per_core:,}")
print(f" num_cores: {spec.num_cores}")
print(f" max_feedback_delay: {spec.max_feedback_delay}")
Output:
HardwareSpec:
max_neurons_per_core: 1024
max_synapses_per_core: 200,000
sram_bytes_per_core: 524,288
num_cores: 4
max_feedback_delay: 3
Device Presets
TALON includes presets for common FPGA targets:
zynq = talongraph.HardwareSpec.zynq_7020()
print(f"Zynq-7020: cores={zynq.num_cores}, sram={zynq.sram_bytes_per_core:,}")
zcu = talongraph.HardwareSpec.zcu102()
print(f"ZCU102: cores={zcu.num_cores}, sram={zcu.sram_bytes_per_core:,}")
Output:
Zynq-7020: cores=4, sram=131,072
ZCU102: cores=8, sram=524,288
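As a quick sanity check, both preset SRAM figures are round powers of two, which is easier to see in KiB:

```python
# Convert the per-core SRAM figures from the presets above into KiB.
zynq_sram_kib = 131_072 // 1024   # Zynq-7020 preset
zcu_sram_kib = 524_288 // 1024    # ZCU102 preset
print(zynq_sram_kib, zcu_sram_kib)  # 128 512
```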
Validation
spec.validate()
print("HardwareSpec validated successfully")
Output:
HardwareSpec validated successfully
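To build intuition for what validation enforces, here is a hypothetical standalone sketch. The field names mirror the HardwareSpec above, but the specific checks are an assumption, not the library's actual implementation:

```python
from dataclasses import dataclass

# Hypothetical sketch of the invariants a validate() call might enforce.
# The checks below are illustrative assumptions, not the talon API.
@dataclass
class SpecSketch:
    max_neurons_per_core: int
    max_synapses_per_core: int
    sram_bytes_per_core: int
    num_cores: int

    def validate(self) -> None:
        if self.num_cores < 1:
            raise ValueError("num_cores must be >= 1")
        if min(self.max_neurons_per_core,
               self.max_synapses_per_core,
               self.sram_bytes_per_core) <= 0:
            raise ValueError("per-core capacities must be positive")

SpecSketch(1024, 200_000, 524_288, 4).validate()  # passes silently
```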
2. Build a Test Graph
nodes = {
"input": ir.Input(np.array([784])),
"fc1": ir.Affine(
weight=np.random.randn(64, 784).astype(np.float32) * 0.01,
bias=np.zeros(64, dtype=np.float32),
),
"lif1": ir.LIF(
tau=np.ones(64, dtype=np.float32) * 10.0,
r=np.ones(64, dtype=np.float32),
v_leak=np.zeros(64, dtype=np.float32),
v_threshold=np.ones(64, dtype=np.float32),
),
"fc2": ir.Affine(
weight=np.random.randn(10, 64).astype(np.float32) * 0.01,
bias=np.zeros(10, dtype=np.float32),
),
"lif2": ir.LIF(
tau=np.ones(10, dtype=np.float32) * 10.0,
r=np.ones(10, dtype=np.float32),
v_leak=np.zeros(10, dtype=np.float32),
v_threshold=np.ones(10, dtype=np.float32),
),
"output": ir.Output(np.array([10])),
}
edges = [
("input", "fc1"), ("fc1", "lif1"), ("lif1", "fc2"),
("fc2", "lif2"), ("lif2", "output"),
]
graph = ir.Graph(nodes=nodes, edges=edges)
print(f"Graph: {len(graph.nodes)} nodes, {len(graph.edges)} edges")
Output:
Graph: 6 nodes, 5 edges
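The edge list fully determines the execution order of this feed-forward graph. As a standalone illustration (the IR presumably does this internally; this is not the talon API), Kahn's algorithm recovers that order:

```python
from collections import defaultdict, deque

# Recover the execution order of the graph above from its edge list
# via Kahn's topological sort. Illustrative sketch, not talon internals.
edges = [
    ("input", "fc1"), ("fc1", "lif1"), ("lif1", "fc2"),
    ("fc2", "lif2"), ("lif2", "output"),
]
succ, indeg = defaultdict(list), defaultdict(int)
nodes = {n for e in edges for n in e}
for src, dst in edges:
    succ[src].append(dst)
    indeg[dst] += 1

queue = deque(sorted(n for n in nodes if indeg[n] == 0))
order = []
while queue:
    n = queue.popleft()
    order.append(n)
    for m in succ[n]:
        indeg[m] -= 1
        if indeg[m] == 0:
            queue.append(m)

print(order)  # ['input', 'fc1', 'lif1', 'fc2', 'lif2', 'output']
```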
3. Partition
partition() assigns each node to a core while respecting SRAM, neuron, and synapse constraints, and returns a new Graph with partition_metadata attached.
partitioned = talongraph.partition(graph, spec)
pm = partitioned.partition_metadata
print(f"Algorithm: {pm['algorithm']}")
print(f"Cores used: {pm['num_cores_used']}")
print("\nAssignments:")
for name, core in pm['assignments'].items():
print(f" {name} -> core {core}")
print(f"\nCross-core edges: {pm['cross_core_edges']}")
print("Metrics:")
print(f" Cut ratio: {pm['metrics']['cut_ratio']:.2f}")
print(f" Balance stddev: {pm['metrics']['balance_stddev']:.2f}")
Output:
Algorithm: greedy
Cores used: 1
Assignments:
input -> core 0
fc1 -> core 0
lif1 -> core 0
fc2 -> core 0
lif2 -> core 0
output -> core 0
Cross-core edges: []
Metrics:
Cut ratio: 0.00
Balance stddev: 0.00
Use algorithm="greedy" (default) for fast partitioning, or algorithm="spectral" for graph-cut optimization on larger models.
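To make the greedy strategy concrete, here is a standalone sketch (not talon's implementation): walk nodes in topological order, open a new core only when the neuron budget would overflow, then compute the cut ratio, assuming cut_ratio means cross-core edges divided by total edges:

```python
# Illustrative greedy partitioner: fill the current core until its
# neuron budget would overflow, then move to the next core.
def greedy_partition(node_neurons, order, max_neurons_per_core):
    assignments, core, used = {}, 0, 0
    for name in order:
        n = node_neurons[name]
        if used + n > max_neurons_per_core:
            core += 1
            used = 0
        assignments[name] = core
        used += n
    return assignments

# Neuron counts for the example graph: only the LIF layers hold neurons.
node_neurons = {"input": 0, "fc1": 0, "lif1": 64,
                "fc2": 0, "lif2": 10, "output": 0}
order = ["input", "fc1", "lif1", "fc2", "lif2", "output"]
assign = greedy_partition(node_neurons, order, max_neurons_per_core=1024)

edges = [("input", "fc1"), ("fc1", "lif1"), ("lif1", "fc2"),
         ("fc2", "lif2"), ("lif2", "output")]
cut = [e for e in edges if assign[e[0]] != assign[e[1]]]
print(len(set(assign.values())), len(cut) / len(edges))  # 1 0.0
```

Everything fits on one core, matching the single-core result above: no edges are cut, so the cut ratio is zero.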
4. Resource Allocation
After partitioning, allocate() checks whether each core's assigned nodes fit within the hardware budget.
res_map = talongraph.allocate(partitioned, spec)
print("Resource Allocation:")
print(f" Total weight bytes: {res_map.total_weight_bytes:,}")
print(f" Total state bytes: {res_map.total_state_bytes:,}")
print(f" Peak utilization: {res_map.peak_core_utilization:.2%}")
print(f" Fits hardware: {res_map.fits_hardware}")
if res_map.violations:
print(f" Violations (cores): {res_map.violations}")
for core_id, budget in res_map.budgets.items():
print(f"\n Core {core_id}:")
print(f" Neurons: {budget.neuron_count}")
print(f" Weight bytes: {budget.weight_bytes:,}")
print(f" Utilization: {budget.utilization:.2%}")
Output:
Resource Allocation:
Total weight bytes: 203,560
Total state bytes: 1,184
Peak utilization: 39.05%
Fits hardware: True
Core 0:
Neurons: 4
Weight bytes: 203,560
Utilization: 39.05%
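The reported 203,560 weight bytes can be verified by hand, assuming both Affine layers store float32 weights and biases (4 bytes per value):

```python
# Back-of-envelope check of the weight-byte figure reported above.
fc1 = 64 * 784 * 4 + 64 * 4  # fc1 weight matrix + bias vector
fc2 = 10 * 64 * 4 + 10 * 4   # fc2 weight matrix + bias vector
print(fc1 + fc2)  # 203560
```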
5. Routing
route() computes inter-core communication paths for cross-core edges. Single-core graphs have no cross-core edges, so the graph passes through unchanged.
routed = talongraph.route(partitioned, spec)
print(f"Routed graph: {len(routed.nodes)} nodes, {len(routed.edges)} edges")
Output:
Routed graph: 6 nodes, 5 edges
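On a 2D mesh, a common routing scheme is dimension-ordered (XY) routing: travel along one axis first, then the other. Whether talon's router uses XY routing is an assumption; this standalone sketch just shows what a hop-by-hop path between two mesh positions looks like:

```python
# Illustrative dimension-order (XY) route between two (row, col) mesh
# positions: move along columns first, then rows. Not the talon internals.
def xy_route(src, dst):
    (r, c), (dr, dc) = src, dst
    path = [(r, c)]
    while c != dc:                 # step along the column axis first
        c += 1 if dc > c else -1
        path.append((r, c))
    while r != dr:                 # then step along the row axis
        r += 1 if dr > r else -1
        path.append((r, c))
    return path

print(xy_route((0, 0), (2, 1)))
# [(0, 0), (0, 1), (1, 1), (2, 1)]
```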
6. Placement
place() maps logical cores onto the physical mesh so as to minimize total hop distance (Manhattan distance weighted by edge count).
placement = talongraph.place(partitioned, spec)
print("Placement:")
print(f" Total hop distance: {placement.total_hop_distance}")
print(f" Improvement over identity: {placement.improvement:.1f}%")
for core_id, pos in placement.logical_to_physical.items():
print(f" Core {core_id} -> row={pos[0]}, col={pos[1]}")
Output:
Placement:
Total hop distance: 0
Improvement over identity: 0.0%
Core 0 -> row=0, col=0
Core 1 -> row=1, col=0
Core 2 -> row=2, col=0
Core 3 -> row=3, col=0
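The placement cost described above can be sketched directly: for each cross-core connection, take the Manhattan distance between the endpoint cores' physical positions, weighted by how many edges connect them. This is an illustrative reimplementation, not the talon internals:

```python
# Placement cost sketch: Manhattan distance between each connected pair
# of cores, weighted by the number of edges between them.
def total_hop_distance(cross_core_edges, logical_to_physical):
    total = 0
    for src_core, dst_core, weight in cross_core_edges:
        r1, c1 = logical_to_physical[src_core]
        r2, c2 = logical_to_physical[dst_core]
        total += weight * (abs(r1 - r2) + abs(c1 - c2))
    return total

# The column placement reported above.
placement = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (3, 0)}
# Hypothetical traffic: 2 edges between cores 0 and 1, 1 between 1 and 3.
print(total_hop_distance([(0, 1, 2), (1, 3, 1)], placement))  # 4
```

With no cross-core edges, as in the single-core example, the cost is trivially zero, which matches the reported total hop distance of 0.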
7. End-to-End Pipeline
The full hardware mapping pipeline:
spec = talongraph.HardwareSpec.zcu102()
# 1. Partition
partitioned = talongraph.partition(graph, spec)
pm = partitioned.partition_metadata
print(f"1. Partitioned: {pm['num_cores_used']} cores, algo={pm['algorithm']}")
# 2. Allocate
res = talongraph.allocate(partitioned, spec)
print(f"2. Allocated: fits={res.fits_hardware}, peak_util={res.peak_core_utilization:.1%}")
# 3. Route
routed = talongraph.route(partitioned, spec)
print(f"3. Routed: {len(routed.edges)} edges")
# 4. Place
placed = talongraph.place(partitioned, spec)
print(f"4. Placed: total_hops={placed.total_hop_distance}, improvement={placed.improvement:.1f}%")
Output:
1. Partitioned: 1 cores, algo=greedy
2. Allocated: fits=True, peak_util=39.0%
3. Routed: 5 edges
4. Placed: total_hops=0, improvement=0.0%
API Reference
| Function / Class | Purpose |
|---|---|
| HardwareSpec(...) | Define hardware constraints |
| HardwareSpec.zynq_7020() | Zynq-7020 preset |
| HardwareSpec.zcu102() | ZCU102 (Zynq UltraScale+) preset |
| HardwareSpec.t1c_asic_target() | TALON ASIC target preset |
| partition(graph, spec) | Assign nodes to cores |
| allocate(graph, spec) | Check resource budgets per core |
| route(graph, spec) | Compute inter-core routing paths |
| place(graph, spec) | Map cores to physical mesh positions |
What's Next
- Tutorial: Backend — Simulate, profile, and compile for FPGA
- Tutorial: Event I/O — Neuromorphic event data handling
- Architecture — Ecosystem architecture overview