CUDA Backend

The CUDA backend provides GPU-accelerated implementations of all core functionality for topology optimization.

CUDA backend public API.

Importing from this module gives access to GPU/CUDA implementations of core pyFANTOM components. The API mirrors the CPU backend so users can switch backends by changing the import path.

>>> from pyFANTOM.CUDA import StructuredMesh2D, FiniteElement, MinimumCompliance

Mesh Classes

Filter Classes

Finite Element Classes

Stiffness Kernel Classes

Solver Classes

Optimizer Classes

Problem Classes

Detailed Documentation

class pyFANTOM.geom.CUDA._mesh.CuStructuredMesh2D(nx, ny, lx, ly, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

Bases: StructuredMesh

CUDA-accelerated 2D structured mesh with uniform rectangular elements.

GPU version of StructuredMesh2D using CuPy arrays for all data storage. Identical API to CPU version but operates on GPU memory for efficient assembly and computation.

Parameters:
  • nx (int) – Number of elements in x-direction

  • ny (int) – Number of elements in y-direction

  • lx (float) – Physical length of domain in x-direction

  • ly (float) – Physical length of domain in y-direction

  • dtype (np.dtype, optional) – Data type for arrays (default: np.float64)

  • physics (Physx, optional) – Physics model defining material behavior (default: LinearElasticity(E=1.0, nu=1/3))

Notes

  • All arrays are stored as CuPy arrays on GPU

  • Element stiffness matrix is computed on CPU then transferred to GPU

  • Use with CuStructuredStiffnessKernel for GPU-accelerated assembly

  • Requires CUDA-capable GPU and CuPy installation

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D
>>> from pyFANTOM import LinearElasticity
>>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0)
>>> print(f"GPU memory usage: {mesh.nodes.nbytes + mesh.elements.nbytes} bytes")
__init__(nx, ny, lx, ly, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]
class pyFANTOM.geom.CUDA._mesh.CuStructuredMesh3D(nx, ny, nz, lx, ly, lz, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

Bases: StructuredMesh

CUDA-accelerated 3D structured mesh with uniform hexahedral elements.

GPU version of StructuredMesh3D using CuPy arrays for all data storage. Identical API to CPU version but operates on GPU memory for efficient 3D topology optimization.

Parameters:
  • nx (int) – Number of elements in x-direction

  • ny (int) – Number of elements in y-direction

  • nz (int) – Number of elements in z-direction

  • lx (float) – Physical length of domain in x-direction

  • ly (float) – Physical length of domain in y-direction

  • lz (float) – Physical length of domain in z-direction

  • dtype (np.dtype, optional) – Data type for arrays (default: np.float64)

  • physics (Physx, optional) – Physics model defining material behavior (default: LinearElasticity(E=1.0, nu=1/3))

Notes

  • All arrays are stored as CuPy arrays on GPU

  • Use MultiGrid solver for large 3D problems to manage GPU memory

  • Element stiffness matrix computed on CPU then transferred

  • Requires sufficient GPU memory for 3D problems (typically >8GB for 100^3 elements)

Examples

>>> from pyFANTOM.CUDA import StructuredMesh3D
>>> mesh = StructuredMesh3D(nx=64, ny=64, nz=32, lx=1.0, ly=1.0, lz=0.5)
>>> print(f"Total DOF: {len(mesh.nodes) * mesh.dof}")
__init__(nx, ny, nz, lx, ly, lz, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]
class pyFANTOM.geom.CUDA._mesh.CuGeneralMesh(nodes, elements, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

Bases: GeneralMesh

CUDA-accelerated general unstructured mesh.

GPU version of GeneralMesh. Inherits CPU mesh initialization logic then transfers all arrays to GPU memory. Supports both uniform and heterogeneous element topologies.

Parameters:
  • nodes (ndarray) – Node coordinates (CPU array), shape (n_nodes, spatial_dim)

  • elements (list or ndarray) – Element connectivity (CPU array or list)

  • dtype (np.dtype, optional) – Data type for arrays (default: np.float64)

  • physics (Physx, optional) – Physics model (default: LinearElasticity(E=1.0, nu=1/3))

Notes

  • Mesh processing (node cleanup, stiffness computation) happens on CPU

  • Resulting arrays are transferred to GPU after initialization

  • Use CuGeneralStiffnessKernel for GPU-accelerated assembly

  • For large unstructured meshes, GPU offers significant speedup over CPU

Examples

>>> from pyFANTOM.CUDA import GeneralMesh
>>> import numpy as np
>>> nodes = np.array([[0,0], [1,0], [0,1], [1,1]])
>>> elements = np.array([[0,1,2], [1,3,2]])
>>> mesh = GeneralMesh(nodes, elements)
__init__(nodes, elements, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]
class pyFANTOM.geom.CUDA._filters.CuStructuredFilter2D(mesh: CuStructuredMesh2D, r_min)[source]

Bases: FilterKernel

CUDA-accelerated 2D density filter for structured meshes.

GPU version of StructuredFilter2D using CuPy. Fast convolution-based filtering on GPU. Identical API to CPU version but operates on GPU memory for maximum performance.

Parameters:
shape

Filter matrix dimensions (n_elements, n_elements)

Type:

tuple

weights

Precomputed filter weights on GPU

Type:

cupy.ndarray

offsets

Element index offsets for neighbors on GPU

Type:

cupy.ndarray

normalizer

Per-element normalization for adjoint on GPU

Type:

cupy.ndarray

dot(rho)

Forward filter application

_rmatvec(sens)[source]

Adjoint filter for sensitivity backpropagation

Notes

  • Uses cone filter: w(d) = (r_min - d) / r_min

  • Handles non-square elements (dx != dy) automatically

  • Typical r_min values: 1.5-3.0 element units

  • Larger r_min produces smoother, coarser designs

  • All arrays stored on GPU as CuPy arrays

  • GPU-accelerated filtering for maximum performance

  • Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredFilter2D
>>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0)
>>> filter = StructuredFilter2D(mesh=mesh, r_min=2.0)
>>> rho_smooth = filter.dot(rho)
__init__(mesh: CuStructuredMesh2D, r_min)[source]
class pyFANTOM.geom.CUDA._filters.CuStructuredFilter3D(mesh: CuStructuredMesh3D, r_min)[source]

Bases: FilterKernel

CUDA-accelerated 3D density filter for structured meshes.

GPU version of StructuredFilter3D using CuPy. All filtering operations on GPU for maximum performance in 3D topology optimization. Identical API to CPU version but operates on GPU memory.

Parameters:
  • mesh (CuStructuredMesh3D) – CUDA 3D structured mesh

  • r_min (float) – Filter radius in element units (e.g., r_min=1.5 includes elements within 1.5 element widths)

shape

Filter matrix dimensions (n_elements, n_elements)

Type:

tuple

weights

Precomputed filter weights for neighbor elements on GPU

Type:

cupy.ndarray

offsets

Element index offsets for neighbors within filter radius on GPU

Type:

cupy.ndarray

normalizer

Per-element normalization factors for adjoint operation on GPU

Type:

cupy.ndarray

dot(rho)

Apply forward filter: filtered_rho = H @ rho

_rmatvec(sens)[source]

Apply adjoint filter for sensitivity backpropagation: H^T @ sens

Notes

  • Filter uses cone-shaped kernel: w(d) = (r_min - d) / r_min for d < r_min

  • Automatically handles anisotropic elements (dx != dy != dz)

  • Adjoint includes proper normalization for optimization gradients

  • Memory-efficient: stores only unique weights and offsets, not full matrix

  • All arrays stored on GPU as CuPy arrays

  • 5-10x faster than CPU for large 3D problems

  • Essential for large-scale 3D optimization

  • Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import StructuredMesh3D, StructuredFilter3D
>>> mesh = StructuredMesh3D(nx=64, ny=64, nz=32, lx=1.0, ly=1.0, lz=0.5)
>>> filter = StructuredFilter3D(mesh=mesh, r_min=1.5)
>>> rho_filtered = filter.dot(rho_raw)
__init__(mesh: CuStructuredMesh3D, r_min)[source]
class pyFANTOM.geom.CUDA._filters.CuGeneralFilter(mesh: CuGeneralMesh, r_min)[source]

Bases: FilterKernel

CUDA-accelerated density filter for unstructured meshes.

GPU version of GeneralFilter using CuPy sparse matrices. Implements density filtering for general unstructured meshes using spatial search. Stores explicit sparse filter matrix.

Parameters:
  • mesh (CuGeneralMesh) – CUDA-accelerated general unstructured mesh

  • r_min (float) – Filter radius in physical units (not element units)

kernel

Explicit filter matrix stored on GPU, shape (n_elements, n_elements)

Type:

cupyx.scipy.sparse.csr_matrix

shape

Filter matrix dimensions

Type:

tuple

dot(rho)

Forward filter: H @ rho

_rmatvec(sens)[source]

Adjoint filter: H^T @ sens

Notes

  • Uses spatial search (KDTree on CPU) to build filter, then transfers to GPU

  • Stores explicit sparse matrix (memory-intensive for large meshes)

  • r_min is in physical units, not element-relative

  • Automatically detects 2D vs 3D from mesh

  • GPU acceleration provides 3-5x speedup over CPU for large meshes

Examples

>>> from pyFANTOM.CUDA import GeneralMesh, GeneralFilter
>>> mesh = GeneralMesh(nodes, elements)  # Triangular mesh
>>> filter = GeneralFilter(mesh=mesh, r_min=0.05)  # Physical radius
>>> rho_filtered = filter.dot(rho)
__init__(mesh: CuGeneralMesh, r_min)[source]
class pyFANTOM.FiniteElement.CUDA.FiniteElement.FiniteElement(mesh: CuStructuredMesh2D | CuStructuredMesh3D | CuGeneralMesh, kernel: StructuredStiffnessKernel | GeneralStiffnessKernel | UniformStiffnessKernel, solver: CG | GMRES | SPSOLVE | MultiGrid)[source]

Bases: FiniteElement

CUDA-accelerated finite element analysis engine.

GPU version of FiniteElement using CuPy for all computations. Provides identical API to CPU version but with 5-10x performance improvement for large problems. All arrays stored on GPU memory.

Parameters:
mesh

Associated CUDA mesh

Type:

Mesh

kernel

Stiffness assembly kernel

Type:

StiffnessKernel

solver

Linear solver

Type:

Solver

rhs

Right-hand side force vector on GPU, shape (n_nodes * dof,)

Type:

cupy.ndarray

d_rhs

Prescribed displacement values for Dirichlet BCs, shape (n_nodes * dof,)

Type:

cupy.ndarray

dof

Degrees of freedom per node (2 for 2D, 3 for 3D)

Type:

int

is_3D

True for 3D problems

Type:

bool

nel

Number of elements

Type:

int

Notes

  • All arrays stored as CuPy arrays on GPU

  • KDTree searches performed on CPU (temporary transfer)

  • Visualization methods transfer data to CPU automatically

  • Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import *
>>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = CG(kernel=kernel)
>>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver)
>>>
>>> # Apply BCs and loads
>>> FE.add_dirichlet_boundary_condition(node_ids=fixed_nodes, rhs=0)
>>> FE.add_point_forces(node_ids=load_nodes, forces=cp.array([[0, -1.0]]))
>>>
>>> # Solve
>>> U, residual = FE.solve(rho=cp.ones(nel) * 0.5)
__init__(mesh: CuStructuredMesh2D | CuStructuredMesh3D | CuGeneralMesh, kernel: StructuredStiffnessKernel | GeneralStiffnessKernel | UniformStiffnessKernel, solver: CG | GMRES | SPSOLVE | MultiGrid)[source]

Initialize CUDA finite element analysis engine.

Parameters:

Notes

Initializes force vector (rhs) and Dirichlet BC storage (d_rhs) on GPU. All arrays remain on GPU throughout computation.

add_dirichlet_boundary_condition(node_ids: cupy.ndarray | None = None, positions: cupy.ndarray | None = None, dofs: cupy.ndarray | None = None, rhs: float | cupy.ndarray = 0.0)[source]

Apply Dirichlet (fixed displacement) boundary conditions.

Parameters:
  • node_ids (cp.ndarray, optional) – Node indices to constrain on GPU, shape (n_nodes,)

  • positions (cp.ndarray, optional) – Physical coordinates to constrain (uses KDTree search on CPU), shape (n_nodes, spatial_dim)

  • dofs (cp.ndarray, optional) – DOF mask per node, shape (n_nodes, dof) with 1=constrained, 0=free. If None, all DOFs at specified nodes are constrained

  • rhs (float or cp.ndarray, optional) – Prescribed displacement values. Scalar for zero displacement, or array shape (n_nodes, dof)

Notes

  • Provide either node_ids OR positions, not both

  • dofs array: [[1,0]] constrains only x-displacement in 2D

  • Multiple calls accumulate constraints

  • Modifies kernel.constraints boolean array on GPU

  • KDTree search performed on CPU (nodes transferred temporarily)

add_neumann_boundary_condition(**kwargs)[source]

Apply Neumann boundary conditions (not implemented).

Raises:

NotImplementedError – Always raised. Neumann boundary conditions not yet implemented for CUDA FEA.

add_point_forces(forces: cupy.ndarray, node_ids: cupy.ndarray | None = None, positions: cupy.ndarray | None = None)[source]

Apply point loads to specified nodes.

Parameters:
  • forces (cp.ndarray) – Force vectors on GPU, shape (n_forces, dof) where dof is 2 for 2D or 3 for 3D. For 2D: [[fx, fy]], for 3D: [[fx, fy, fz]]

  • node_ids (cp.ndarray, optional) – Node indices for force application on GPU, shape (n_forces,)

  • positions (cp.ndarray, optional) – Physical coordinates for force application (uses KDTree search on CPU)

Notes

  • Provide either node_ids OR positions

  • Multiple calls accumulate forces

  • Forces are automatically set to prescribed values at Dirichlet nodes

  • Units should match physics model (e.g., Newtons for SI units)

  • KDTree search performed on CPU (nodes transferred temporarily)

reset_forces()[source]

Clear all applied forces.

Sets the right-hand side force vector to zero on GPU. Useful when reconfiguring loading conditions or starting a new analysis.

reset_dirichlet_boundary_conditions()[source]

Remove all Dirichlet boundary conditions.

Clears all fixed displacement constraints and resets the kernel’s constraint state. Useful when reconfiguring boundary conditions.

visualize_problem(ax=None, **kwargs)[source]

Visualize mesh with boundary conditions and loads.

Parameters:
  • ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure

  • **kwargs – Additional arguments passed to plot_problem_2D/3D

Returns:

Plot object (2D: matplotlib axes, 3D: k3d plot)

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

  • GPU arrays are transferred to CPU for visualization (.get() calls)

  • Shows mesh elements, fixed nodes, and applied forces

visualize_density(rho, ax=None, **kwargs)[source]

Visualize density distribution (optimization result).

Parameters:
  • rho (cp.ndarray or ndarray) – Element densities on GPU or CPU, shape (n_elements,) or (n_elements, n_materials)

  • ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure

  • **kwargs – Additional arguments passed to plot_problem_2D/3D

Returns:

Plot object (2D: matplotlib axes, 3D: k3d plot)

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

  • GPU arrays are automatically transferred to CPU for visualization

  • Color-codes elements by density value (0=void, 1=solid)

visualize_field(field, ax=None, rho=None, **kwargs)[source]

Visualize scalar or vector field (displacement, stress, strain, etc.).

Parameters:
  • field (cp.ndarray or ndarray) – Field values to plot on GPU or CPU. Shape depends on field type: - Scalar: (n_elements,) or (n_nodes,) - Vector: (n_nodes, dof) for displacement-like fields

  • ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure

  • rho (cp.ndarray or ndarray, optional) – Density mask to hide void regions, shape (n_elements,)

  • **kwargs – Additional arguments passed to plot_field_2D/3D

Returns:

Plot object (2D: matplotlib axes, 3D: k3d plot)

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

  • GPU arrays are automatically transferred to CPU for visualization

  • Common fields: displacement, Von Mises stress, strain energy

  • If rho provided, elements with rho < 0.5 are hidden

solve(rho=None)[source]

Solve finite element system: K(rho) @ U = F on GPU.

Parameters:

rho (cp.ndarray, optional) – Element density variables on GPU, shape (n_elements,). If None, uses rho=1 (full density)

Returns:

  • U (cp.ndarray) – Displacement vector on GPU, shape (n_nodes * dof,). Interleaved DOFs: [ux0, uy0, ux1, uy1, …]

  • residual (float) – Normalized residual ||K@U - F|| / ||F|| for solution validation

Notes

  • residual < 1e-5 indicates accurate solution

  • residual > 1e-2 indicates ill-conditioned system

  • All operations performed on GPU for maximum performance

  • Solver reuses factorization if available

  • For topology optimization, rho represents material distribution

class pyFANTOM.stiffness.CUDA._FEA.StructuredStiffnessKernel(mesh: CuStructuredMesh2D | CuStructuredMesh3D)[source]

Bases: StiffnessKernel

CUDA-accelerated stiffness kernel for structured meshes.

GPU version of CPU StructuredStiffnessKernel using CuPy. Uses node-basis assembly with a single precomputed element stiffness matrix for maximum efficiency on uniform grid meshes. Identical API to CPU version but operates entirely on GPU memory for maximum performance.

Parameters:

mesh (CuStructuredMesh2D or CuStructuredMesh3D) – CUDA-accelerated structured mesh with uniform elements

K_single

Single element stiffness matrix used for all elements (on GPU)

Type:

cupy.ndarray

shape

Global stiffness matrix dimensions (n_dof, n_dof)

Type:

tuple

constraints

Boolean array marking constrained DOFs (on GPU)

Type:

cupy.ndarray (bool)

has_cons

True if boundary conditions have been applied

Type:

bool

non_con_map

Index map to non-constrained DOFs for reduced system (on GPU)

Type:

cupy.ndarray

set_rho(rho)

Set design variables for subsequent matrix-free operations

dot(rhs)[source]

Matrix-vector product K @ rhs (requires rho to be set)

construct(rho)[source]

Build explicit CSR matrix representation

diagonal(rho=None)[source]

Extract diagonal of stiffness matrix

add_constraints(dof_indices)[source]

Apply Dirichlet boundary conditions

process_grad(U)[source]

Compute element-wise sensitivities dK/drho : U for adjoint method

Notes

  • Fastest kernel for structured meshes (2-3x faster than UniformStiffnessKernel on GPU)

  • All arrays stored as CuPy arrays on GPU

  • 5-10x faster than CPU for large 3D problems

  • Automatically handles constrained DOFs by zeroing rows/columns

  • Matrix-free dot() operation uses optimized CUDA kernels

  • Use construct() only when explicit matrix is needed (e.g., for direct solvers)

  • Requires CUDA-capable GPU and CuPy

  • Use with CUDA solvers (CG, MultiGrid)

  • Memory limited by GPU VRAM

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel
>>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> kernel.set_rho(rho)
>>> u = kernel.dot(f)  # Matrix-free K @ f on GPU
__init__(mesh: CuStructuredMesh2D | CuStructuredMesh3D)[source]

Initialize base stiffness kernel.

Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.

diagonal(rho=None)[source]

Extract diagonal of assembled stiffness matrix.

Parameters:

rho (cp.ndarray, optional) – Design densities. If None, uses stored rho

Returns:

diag – Diagonal entries of K(rho)

Return type:

cp.ndarray

Notes

  • Used for Jacobi preconditioning in iterative solvers

  • Cached on subsequent calls with same rho

set_constraints(constraints)[source]

Set Dirichlet boundary conditions (replaces existing constraints).

Parameters:

constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Clears all existing constraints and sets only the specified DOFs. Use add_constraints() to add constraints incrementally.

add_constraints(constraints)[source]

Add Dirichlet boundary conditions (accumulates with existing constraints).

Parameters:

constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Adds to existing constraints. Use set_constraints() to replace all constraints.

process_grad(U)[source]

Compute element-wise compliance sensitivities dC/drho.

Parameters:

U (cp.ndarray) – Displacement vector from FEA solve on GPU, shape (n_dof,)

Returns:

dk – Element sensitivities on GPU (compliance derivative per element), shape (n_elements,)

Return type:

cp.ndarray

Notes

  • Computes: dk[e] = -U[e]^T @ K_e @ U[e]

  • Used for compliance minimization gradient

  • Negative sign convention for minimization

  • All operations performed on GPU

construct(rho)[source]

Build explicit CSR sparse matrix representation on GPU.

Parameters:

rho (cp.ndarray) – Design variables on GPU, shape (n_elements,)

Returns:

Sparse stiffness matrix in CSR format on GPU

Return type:

cupyx.scipy.sparse.csr_matrix

Notes

  • Memory-intensive: O(nnz) storage on GPU

  • Required for direct solvers (SPSOLVE)

  • Reuses sparsity pattern on subsequent calls (faster)

dot(rhs)[source]

Matrix-free matrix-vector or matrix-matrix product K(rho) @ rhs.

Parameters:

rhs (cp.ndarray or cp.sparse.csr_matrix) – Right-hand side vector or matrix

Returns:

result – Product K(rho) @ rhs

Return type:

cp.ndarray or cp.sparse.csr_matrix

Raises:

Notes

  • Requires rho set via set_rho() first

  • Matrix-free: No explicit matrix storage

  • Faster than constructing full matrix for large problems

class pyFANTOM.stiffness.CUDA._FEA.GeneralStiffnessKernel(mesh: CuGeneralMesh)[source]

Bases: StiffnessKernel

CUDA-accelerated stiffness kernel for heterogeneous meshes.

GPU version of CPU GeneralStiffnessKernel. Most general kernel supporting meshes with elements of varying sizes and topologies. Uses flat storage with pointer arrays for memory-efficient representation. Identical API to CPU version but operates entirely on GPU.

Parameters:

mesh (CuGeneralMesh) – CUDA general mesh with potentially heterogeneous elements (mesh.is_uniform can be False)

K_flat

Flattened element stiffness matrices (on GPU)

Type:

cupy.ndarray

K_ptr

Pointer array for accessing K_flat: K_e = K_flat[K_ptr[e]:K_ptr[e+1]] (on GPU)

Type:

cupy.ndarray

elements_flat

Flattened element connectivity (on GPU)

Type:

cupy.ndarray

elements_ptr

Pointer array for accessing elements (on GPU)

Type:

cupy.ndarray

element_sizes

Number of nodes per element, shape (n_elements,) (on GPU)

Type:

cupy.ndarray

shape

Global stiffness matrix dimensions (n_dof, n_dof)

Type:

tuple

constraints

Boolean array for constrained DOFs (on GPU)

Type:

cupy.ndarray (bool)

has_cons

True if boundary conditions have been applied

Type:

bool

Same interface as StructuredStiffnessKernel (set_rho, dot, construct, diagonal, etc.)

Notes

  • Supports mixed element types (triangles + quads, tets + hexes, etc.)

  • Flat storage with pointers enables variable element sizes

  • Slower than Uniform/Structured kernels due to indirection

  • Automatically used if mesh.is_uniform == False

  • All arrays stored as CuPy arrays on GPU

  • All operations on GPU using CuPy

  • Use with CUDA solvers (CG, MultiGrid) for best performance

  • Memory limited by GPU VRAM

Examples

>>> from pyFANTOM.CUDA import GeneralMesh, GeneralStiffnessKernel
>>> import numpy as np
>>> # Mixed triangular and quad mesh
>>> nodes = np.array([[0,0], [1,0], [0,1], [1,1], [2,0]])
>>> elements = [np.array([0,1,2]), np.array([1,4,3,2])]  # Triangle + Quad
>>> mesh = GeneralMesh(nodes, elements)
>>> kernel = GeneralStiffnessKernel(mesh=mesh)
__init__(mesh: CuGeneralMesh)[source]

Initialize base stiffness kernel.

Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.

diagonal(rho=None)[source]

Compute diagonal entries of stiffness matrix.

Parameters:

rho (cupy.ndarray, optional) – Element density variables, shape (n_elements,). If None, uses cached self.rho

Returns:

Diagonal entries of K(rho), shape (n_dof,)

Return type:

cupy.ndarray

Notes

  • Uses GPU-accelerated flat storage assembly

  • Respects Dirichlet boundary conditions (constrained DOFs zeroed)

  • Caches result in self.diag for reuse

  • Used for preconditioning and diagnostics

set_constraints(constraints)[source]

Set Dirichlet boundary condition DOF (replaces existing constraints).

Parameters:

constraints (cp.ndarray or array-like) – Indices of constrained DOF

Notes

  • Resets all previous constraints

  • Use add_constraints() to append to existing constraints

  • Constrained rows/columns are zeroed in matrix operations

add_constraints(constraints)[source]

Add Dirichlet boundary conditions (accumulates with existing constraints).

Parameters:

constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Adds to existing constraints. Use set_constraints() to replace all constraints. - Use set_constraints() to replace all constraints

process_grad(U)[source]

Compute element-wise compliance sensitivities dC/drho.

Parameters:

U (cp.ndarray) – Displacement vector from FEA solve

Returns:

dk – Element sensitivities (compliance derivative per element)

Return type:

cp.ndarray

Notes

  • Computes: dk[e] = -U[e]^T @ K_e @ U[e]

  • Used for compliance minimization gradient

  • Negative sign convention for minimization

construct(rho)[source]

Construct CSR stiffness matrix representation.

Parameters:

rho (cupy.ndarray) – Element density variables, shape (n_elements,)

Returns:

Sparse CSR stiffness matrix K(rho), shape (n_dof, n_dof)

Return type:

cupy.sparse.csr_matrix

Notes

  • Assembles full sparse matrix structure (first call) or updates values (subsequent)

  • Uses matrix-matrix product with identity to extract sparsity pattern

  • Caches sparsity pattern (ptr) for efficient updates

  • Sets has_been_constructed flag after first call

  • More memory-intensive than matrix-free operations but needed for some solvers

dot(rhs)[source]

Matrix-vector or matrix-matrix product K @ rhs.

Parameters:

rhs (cupy.ndarray or cupy.sparse.csr_matrix) – Right-hand side vector or matrix - Vector: shape (n_dof,) - Matrix: shape (n_dof, n_cols) or sparse CSR matrix

Returns:

Result K @ rhs, same type and shape as rhs (except columns)

Return type:

cupy.ndarray or cupy.sparse.csr_matrix

Raises:

Notes

  • Uses cached self.rho if available

  • Automatically detects vector vs matrix input

  • Converts sparse matrices to CSR format if needed

  • Main interface for matrix operations

class pyFANTOM.stiffness.CUDA._FEA.UniformStiffnessKernel(mesh: CuGeneralMesh)[source]

Bases: StiffnessKernel

CUDA-accelerated stiffness kernel for uniform unstructured meshes.

GPU version of CPU UniformStiffnessKernel. Handles unstructured meshes where all elements have the same number of nodes and identical geometry, using element-specific stiffness matrices. More general than StructuredStiffnessKernel but less efficient. Identical API to CPU version but operates entirely on GPU memory.

Parameters:

mesh (CuGeneralMesh) – CUDA general mesh with uniform element topology (mesh.is_uniform must be True)

Ks

Element stiffness matrices, shape (n_elements, dof_per_element, dof_per_element) (on GPU)

Type:

cupy.ndarray

shape

Global stiffness matrix dimensions (n_dof, n_dof)

Type:

tuple

constraints

Boolean array for constrained DOFs (on GPU)

Type:

cupy.ndarray (bool)

has_cons

True if boundary conditions have been applied

Type:

bool

Same interface as StructuredStiffnessKernel (set_rho, dot, construct, diagonal, etc.)

Notes

  • Requires mesh.is_uniform == True (all elements same size)

  • Stores per-element stiffness matrices (more memory than StructuredStiffnessKernel)

  • Use for unstructured uniform meshes (e.g., triangular or tetrahedral)

  • For structured grids, prefer StructuredStiffnessKernel for better performance

  • All arrays stored as CuPy arrays on GPU

  • All operations on GPU using CuPy

  • Use with CUDA solvers (CG, MultiGrid) for best performance

  • Memory limited by GPU VRAM

Examples

>>> from pyFANTOM.CUDA import GeneralMesh, UniformStiffnessKernel
>>> import numpy as np
>>> # Triangular mesh
>>> nodes = np.array([[0,0], [1,0], [0,1], [1,1]])
>>> elements = np.array([[0,1,2], [1,3,2]])
>>> mesh = GeneralMesh(nodes, elements)
>>> kernel = UniformStiffnessKernel(mesh=mesh)
__init__(mesh: CuGeneralMesh)[source]

Initialize base stiffness kernel.

Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.

diagonal(rho=None)[source]

Extract diagonal of stiffness matrix.

Parameters:

rho (cp.ndarray, optional) – Design variables on GPU. If None, uses self.rho (must be set via set_rho())

Returns:

Diagonal entries on GPU, shape (n_dof,)

Return type:

cp.ndarray

Raises:

ValueError – If rho is None and has_rho is False

Notes

Useful for Jacobi preconditioning or diagonal scaling in iterative solvers.

set_constraints(constraints)[source]

Set Dirichlet boundary conditions (replaces existing constraints).

Parameters:

constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Clears all existing constraints and sets only the specified DOFs. Use add_constraints() to add constraints incrementally.

add_constraints(constraints)[source]

Add Dirichlet boundary conditions (accumulates with existing constraints).

Parameters:

constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Adds to existing constraints. Use set_constraints() to replace all constraints.

process_grad(U)[source]

Compute element-wise compliance sensitivities dC/drho.

Parameters:

U (cp.ndarray) – Displacement vector from FEA solve on GPU, shape (n_dof,)

Returns:

dk – Element sensitivities on GPU, shape (n_elements,)

Return type:

cp.ndarray

Notes

Computes -U^T @ dK/drho @ U for each element, used in compliance sensitivity analysis. All operations performed on GPU.

construct(rho)[source]

Build explicit CSR sparse matrix representation on GPU.

Parameters:

rho (cp.ndarray) – Design variables on GPU, shape (n_elements,)

Returns:

Sparse stiffness matrix in CSR format on GPU

Return type:

cupyx.scipy.sparse.csr_matrix

Notes

  • Memory-intensive: O(nnz) storage on GPU

  • Required for direct solvers (SPSOLVE)

  • Reuses sparsity pattern on subsequent calls (faster)

dot(rhs)[source]

Matrix-vector or matrix-matrix product K @ rhs.

Parameters:

rhs (cupy.ndarray or cupy.sparse.csr_matrix) – Right-hand side vector or matrix - Vector: shape (n_dof,) - Matrix: shape (n_dof, n_cols) or sparse CSR matrix

Returns:

Result K @ rhs, same type and shape as rhs (except columns)

Return type:

cupy.ndarray or cupy.sparse.csr_matrix

Raises:

Notes

  • Uses cached self.rho if available

  • Automatically detects vector vs matrix input

  • Converts sparse matrices to CSR format if needed

  • Main interface for matrix operations

class pyFANTOM.solvers.CUDA._solvers.CG(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]

Bases: Solver

CUDA-accelerated Conjugate Gradient iterative solver.

GPU version of the CG solver using CuPy. Efficiently solves large sparse symmetric positive definite linear systems Kx=b on GPU with optional matrix-free operation.

Parameters:
  • kernel (CuStiffnessKernel) – CUDA stiffness kernel (matrix operator)

  • maxiter (int, optional) – Maximum number of CG iterations (default: 1000)

  • tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)

  • matrix_free (bool, optional) – Whether to use matrix-free mode. If None, automatically determined: - True for StructuredStiffnessKernel (recommended) - False for General/UniformStiffnessKernel (faster)

Notes

  • Matrix-free mode: Lower memory, slower per-iteration

  • Assembled matrix mode: Higher memory, faster per-iteration

  • For structured meshes, matrix-free is preferred

  • For general meshes, assembled matrix is faster

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, CG
>>> mesh = StructuredMesh2D(nx=256, ny=256, lx=1.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = CG(kernel=kernel, maxiter=1000, tol=1e-6)
__init__(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]
solve(rhs, rho=None, use_last=True)[source]

Solve linear system K(rho) @ U = rhs on GPU.

Parameters:
  • rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)

  • rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)

  • use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True

Returns:

  • U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)

  • residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||

Raises:

ValueError – If rho is not provided and kernel.has_rho is False

Notes

  • All operations performed on GPU using CuPy

  • Warm-starting (use_last=True) accelerates convergence

  • Matrix-free mode uses kernel.dot() directly

  • Assembled mode constructs explicit CSR matrix first

class pyFANTOM.solvers.CUDA._solvers.GMRES(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]

Bases: Solver

CUDA-accelerated Generalized Minimal Residual iterative solver.

GPU version of GMRES solver using CuPy. Solves non-symmetric linear systems Kx=b on GPU. More general than CG but typically slower and more memory-intensive.

Parameters:
  • kernel (CuStiffnessKernel) – CUDA stiffness kernel (matrix operator)

  • maxiter (int, optional) – Maximum number of GMRES iterations (default: 1000)

  • tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)

  • matrix_free (bool, optional) – Whether to use matrix-free mode. If None, automatically determined: - True for StructuredStiffnessKernel - False for General/UniformStiffnessKernel (recommended)

Notes

  • GMRES handles non-symmetric systems (rare in FEA)

  • CG is preferred for symmetric positive definite systems

  • Higher memory usage than CG due to Krylov subspace storage

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, GMRES
>>> solver = GMRES(kernel=kernel, maxiter=1000, tol=1e-6)
__init__(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]
solve(rhs, rho=None, use_last=True)[source]

Solve linear system K(rho) @ U = rhs on GPU using GMRES.

Parameters:
  • rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)

  • rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)

  • use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True

Returns:

  • U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)

  • residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||

Raises:

ValueError – If rho is not provided and kernel.has_rho is False

Notes

  • All operations performed on GPU using CuPy

  • Warm-starting (use_last=True) accelerates convergence

  • Matrix-free mode uses kernel.dot() directly

  • Assembled mode constructs explicit CSR matrix first

  • GMRES handles non-symmetric systems (rare in FEA)

  • CG is preferred for symmetric positive definite systems

class pyFANTOM.solvers.CUDA._solvers.SPSOLVE(kernel: StiffnessKernel)[source]

Bases: Solver

CUDA sparse direct solver using CuPy’s spsolve.

GPU-accelerated direct sparse LU factorization solver. Fast for small to medium problems but memory-intensive. Limited to < 3M DOF due to GPU memory constraints.

Parameters:

kernel (CuStiffnessKernel) – CUDA stiffness kernel (must construct explicit matrix)

Raises:

ValueError – If kernel.shape[0] > 3e6 (3 million DOF limit)

Notes

  • Direct solver: No iterations, exact solution (within numerical precision)

  • High memory usage: Stores factorization on GPU

  • Use iterative solvers (CG, MultiGrid) for large problems

  • Automatically handles constrained DOF

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, SPSOLVE
>>> mesh = StructuredMesh2D(nx=64, ny=64, lx=1.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = SPSOLVE(kernel=kernel)  # OK for small problems
__init__(kernel: StiffnessKernel)[source]
solve(rhs, rho=None, **kwargs)[source]

Solve the linear system Kx = rhs using sparse direct solver.

Parameters:
  • rhs (cp.ndarray) – Right-hand side vector

  • rho (cp.ndarray, optional) – Design variable densities. If None, uses kernel’s stored rho

  • **kwargs – Additional keyword arguments (unused, for API compatibility)

Returns:

  • U (cp.ndarray) – Solution vector

  • residual (float) – Relative residual norm ||rhs - K@U|| / ||rhs||

Notes

  • Direct solver: No warm start needed

  • Constructs and factorizes K every call (expensive)

  • Consider iterative solvers for repeated solves in optimization

class pyFANTOM.solvers.CUDA._solvers.MultiGrid(mesh: CuStructuredMesh2D | CuStructuredMesh3D, kernel: StiffnessKernel, maxiter=1000, tol=1e-05, n_smooth=3, omega=0.5, n_level=3, cycle='W', w_level=1, coarse_solver='cholmod', matrix_free=False, low_level_tol=1e-08, low_level_maxiter=5000, min_omega=0.4, omega_boost=1.06)[source]

Bases: Solver

CUDA-accelerated Geometric Multigrid solver for structured meshes.

GPU implementation of geometric multigrid with Jacobi smoothing and coarse grid correction. Most memory-efficient solver for large 3D problems on GPU. Uses restriction/prolongation operators on nested grids for O(n) complexity per iteration.

Parameters:
  • mesh (StructuredMesh2D or StructuredMesh3D) – CUDA structured mesh (required for geometric coarsening)

  • kernel (CuStiffnessKernel) – CUDA stiffness kernel (must be StructuredStiffnessKernel)

  • maxiter (int, optional) – Maximum outer iterations (preconditioned CG) (default: 1000)

  • tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)

  • n_smooth (int, optional) – Number of Jacobi smoothing iterations per level (default: 3)

  • omega (float, optional) – Jacobi relaxation parameter (default: 0.5)

  • n_level (int, optional) – Number of multigrid levels (default: 3)

  • cycle (str, optional) – Multigrid cycle type: ‘V’ or ‘W’ (default: ‘W’)

  • w_level (int or list, optional) – Levels to apply W-cycle at (default: 1)

  • coarse_solver (str, optional) – Coarse grid solver: ‘cholmod’, ‘cg’, ‘gmres’, ‘splu’, ‘spsolve’ (default: ‘cholmod’)

  • matrix_free (bool, optional) – Whether to use matrix-free operators (default: False, not yet implemented)

  • low_level_tol (float, optional) – Tolerance for coarse grid iterative solver (default: 1e-8)

  • low_level_maxiter (int, optional) – Max iterations for coarse grid iterative solver (default: 5000)

  • min_omega (float, optional) – Minimum omega for adaptive relaxation (default: 0.4)

  • omega_boost (float, optional) – Omega boost factor per level (default: 1.06)

Notes

  • Only works with structured meshes (uniform grid coarsening)

  • Memory efficient: No explicit matrix storage

  • Preconditioned CG with multigrid preconditioner

  • Adaptive omega based on residual

  • CHOLMOD coarse solver requires CPU transfer (fast for small coarse grids)

Raises:

ValueError – If matrix_free=True with splu/spsolve coarse solver If coarse_solver not in [‘cg’, ‘gmres’, ‘splu’, ‘spsolve’, ‘cholmod’]

Examples

>>> from pyFANTOM.CUDA import StructuredMesh3D, StructuredStiffnessKernel, MultiGrid
>>> mesh = StructuredMesh3D(nx=128, ny=128, nz=128, lx=1.0, ly=1.0, lz=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = MultiGrid(mesh=mesh, kernel=kernel, n_level=4, cycle='W')
__init__(mesh: CuStructuredMesh2D | CuStructuredMesh3D, kernel: StiffnessKernel, maxiter=1000, tol=1e-05, n_smooth=3, omega=0.5, n_level=3, cycle='W', w_level=1, coarse_solver='cholmod', matrix_free=False, low_level_tol=1e-08, low_level_maxiter=5000, min_omega=0.4, omega_boost=1.06)[source]
reset()[source]

Reset cached multigrid hierarchy (forces rebuild on next solve).

solve(rhs, rho=None, use_last=True)[source]

Solve linear system K(rho) @ U = rhs on GPU using multigrid-preconditioned CG.

Parameters:
  • rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)

  • rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)

  • use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True

Returns:

  • U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)

  • residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||

Raises:

ValueError – If rho is not provided and kernel.has_rho is False

Notes

  • All operations performed on GPU using CuPy

  • First call builds multigrid hierarchy (expensive, ~1-5 seconds)

  • Subsequent calls reuse hierarchy (fast, ~0.1-1 second)

  • Uses preconditioned CG with multigrid V/W-cycle preconditioner

  • Adaptive omega adjusts relaxation based on residual reduction

  • Frees GPU memory after solve to prevent accumulation

  • Most memory-efficient solver for large 3D problems (>10M DOF)

class pyFANTOM.Optimizers.CUDA.MMA.MMA(problem: Problem, sub_tol=1e-07, sub_maxiter=100, change_tol=0.0001, fun_tol=1e-06, move=0.5, timer=False)[source]

Bases: Optimizer

CUDA-accelerated Method of Moving Asymptotes (MMA) optimizer.

GPU version of MMA using CuPy arrays. Identical API to CPU version but operates on GPU memory for maximum performance in large-scale topology optimization.

Parameters:
  • problem (Problem) – Optimization problem (e.g., MinimumCompliance) with CUDA backend

  • sub_tol (float, optional) – Sub-problem convergence tolerance (default: 1e-7)

  • sub_maxiter (int, optional) – Sub-problem max iterations (default: 100)

  • change_tol (float, optional) – Design variable change convergence tolerance (default: 1e-4)

  • fun_tol (float, optional) – Objective function change tolerance (default: 1e-6)

  • move (float, optional) – Move limit for design variables, range [0,1] (default: 0.5)

  • timer (bool, optional) – Return iteration time if True (default: False)

Notes

  • All arrays stored as CuPy arrays on GPU

  • 5-10x faster than CPU for large problems (>1M DOF)

  • Requires CUDA-capable GPU and CuPy

  • Best for: General constraints, multi-material, multi-constraint problems

Examples

>>> from pyFANTOM.CUDA import MMA, MinimumCompliance
>>> optimizer = MMA(problem=problem, sub_tol=1e-7, fun_tol=1e-6)
>>> for i in range(100):
>>>     optimizer.iter()
>>>     if optimizer.converged():
>>>         break
__init__(problem: Problem, sub_tol=1e-07, sub_maxiter=100, change_tol=0.0001, fun_tol=1e-06, move=0.5, timer=False)[source]

Create an optimizer bound to a problem.

Parameters - problem: instance of Problem providing objective and gradients.

iter()[source]

Perform one MMA optimization iteration on GPU.

Updates design variables by solving a convex sub-problem approximation of the original problem. Uses moving asymptotes to control step size. All operations performed on GPU using CuPy.

Returns:

If timer=True, returns iteration time in seconds

Return type:

float, optional

Notes

  • Solves convex sub-problem using mmasub() on GPU

  • Updates moving asymptotes (low, upp) for next iteration

  • Tracks design variable change and objective change

  • Handles NaN values by clipping to bounds

converged(*args, **kwargs)[source]

Check if optimizer has converged.

Returns:

True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol

Return type:

bool

Notes

Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.

logs()[source]

Return diagnostic information for current iteration.

Returns:

Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)

Return type:

dict

Notes

Used for monitoring optimization progress and convergence.

class pyFANTOM.Optimizers.CUDA.OC.OC(problem: Problem, move: float = 0.2, change_tol=0.0001, fun_tol=1e-06, timer=False)[source]

Bases: Optimizer

CUDA-accelerated Optimality Criteria (OC) optimizer.

GPU version of OC using CuPy arrays. Fast heuristic optimizer specifically designed for topology optimization with independent volume constraints. Identical API to CPU version.

Parameters:
  • problem (Problem) – Optimization problem (must have independent constraints) with CUDA backend

  • move (float, optional) – Move limit for design variables (default: 0.2)

  • change_tol (float, optional) – Convergence tolerance for design change (default: 1e-4)

  • fun_tol (float, optional) – Convergence tolerance for objective change (default: 1e-6)

  • timer (bool, optional) – Return iteration time (default: False)

Notes

  • All arrays stored as CuPy arrays on GPU

  • 2-3x faster than CPU per iteration

  • Best for: Classic minimum compliance with volume constraint

  • Limited scope: Only works with independent constraints

  • Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import OC
>>> optimizer = OC(problem=problem, move=0.2)
>>> for i in range(200):
>>>     optimizer.iter()
>>>     if optimizer.converged():
>>>         break
__init__(problem: Problem, move: float = 0.2, change_tol=0.0001, fun_tol=1e-06, timer=False)[source]

Create an optimizer bound to a problem.

Parameters - problem: instance of Problem providing objective and gradients.

iter()[source]

Perform one OC optimization iteration on GPU.

Updates design variables using optimality criteria update formula with bisection to find Lagrange multiplier satisfying volume constraint. All operations performed on GPU using CuPy.

Returns:

If timer=True, returns iteration time in seconds

Return type:

float, optional

Notes

  • Computes OC parameter: ocP = x * sqrt(-df/dg)

  • Uses bisection to find Lagrange multiplier lambda

  • Updates: x_new = clip(xL, xU, ocP/lambda)

  • Enforces move limits: xL = x - move, xU = x + move

converged(*args, **kwargs)[source]

Check if optimizer has converged.

Returns:

True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol

Return type:

bool

Notes

Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.

logs()[source]

Return diagnostic information for current iteration.

Returns:

Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)

Return type:

dict

Notes

Used for monitoring optimization progress and convergence.

class pyFANTOM.Optimizers.CUDA.PGD.PGD(problem: Problem, change_tol=0.0001, fun_tol=1e-06, maxiter_N=50, tol_B=1e-08, tol_N=1e-06, C=1000000000000.0, fall_back_move=0.2, alpha_max=100.0, relaxation=1.0, warmup_iter=50, timer=False, conjugate_directions=True)[source]

Bases: Optimizer

CUDA-accelerated Projected Gradient Descent optimizer.

GPU version of PGD using CuPy arrays. Fast first-order optimizer using BFGS-based step size adaptation and projection onto feasible set. Identical API to CPU version.

Parameters:
  • problem (Problem) – Optimization problem with CUDA backend

  • change_tol (float, optional) – Design variable change tolerance (default: 1e-4)

  • fun_tol (float, optional) – Objective function change tolerance (default: 1e-6)

  • maxiter_N (int, optional) – Max iterations for Newton’s method in projection (default: 50)

  • tol_B (float, optional) – Bisection tolerance (default: 1e-8)

  • tol_N (float, optional) – Newton solver tolerance (default: 1e-6)

  • C (float, optional) – Large constant for penalty (default: 1e12)

  • fall_back_move (float, optional) – Fallback step size when BFGS fails (default: 0.2)

  • alpha_max (float, optional) – Maximum step size (default: 1e2)

  • relaxation (float, optional) – Step size relaxation parameter (default: 1.0)

  • warmup_iter (int, optional) – Iterations before enforcing strict feasibility (default: 50)

  • timer (bool, optional) – Return iteration time (default: False)

  • conjugate_directions (bool, optional) – Use conjugate gradient directions (default: True)

Notes

  • All arrays stored as CuPy arrays on GPU

  • Faster than CPU: ~30-50% faster per iteration

  • Best for: Simple bounds, single constraint problems

  • Less robust: May struggle with multiple nonlinear constraints

  • Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import PGD
>>> optimizer = PGD(problem=problem, conjugate_directions=True)
>>> for i in range(100):
>>>     optimizer.iter()
>>>     if optimizer.converged():
>>>         break
__init__(problem: Problem, change_tol=0.0001, fun_tol=1e-06, maxiter_N=50, tol_B=1e-08, tol_N=1e-06, C=1000000000000.0, fall_back_move=0.2, alpha_max=100.0, relaxation=1.0, warmup_iter=50, timer=False, conjugate_directions=True)[source]

Create an optimizer bound to a problem.

Parameters - problem: instance of Problem providing objective and gradients.

alpha(desvars, df)[source]
project_to_feasible(desvars_new, desvars, dg)[source]
iter()[source]

Perform one PGD optimization iteration on GPU.

Updates design variables using projected gradient descent with adaptive step size (BFGS-based) and projection onto feasible set. All operations performed on GPU using CuPy.

Returns:

If timer=True, returns iteration time in seconds

Return type:

float, optional

Notes

  • Computes step size using BFGS approximation (alpha)

  • Uses conjugate gradient directions if enabled

  • Projects update onto feasible set (satisfies constraints)

  • For independent constraints: uses bisection

  • For coupled constraints: uses Newton’s method with Wolfe line search

converged(*args, **kwargs)[source]

Check if optimizer has converged.

Returns:

True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol

Return type:

bool

Notes

Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.

logs()[source]

Return diagnostic information for current iteration.

Returns:

Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)

Return type:

dict

Notes

Used for monitoring optimization progress and convergence.

Wolfe conditions line search for Newton step

class pyFANTOM.Problem.CUDA.MinimumCompliance.MinimumCompliance(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Bases: Problem

CUDA-accelerated minimum compliance topology optimization problem.

GPU version of MinimumCompliance using CuPy arrays. Identical API to CPU version but operates on GPU memory for maximum performance in large-scale optimization.

Parameters:
  • FE (FiniteElement) – CUDA finite element analysis engine with mesh, kernel, and solver

  • filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter for ensuring minimum feature sizes

  • E_mul (list of float, optional) – Young’s modulus multipliers for each material (default: [1.0] for single material)

  • void (float, optional) – Minimum density to avoid singularity (default: 1e-6)

  • penalty (float, optional) – SIMP penalization exponent (default: 3.0). Higher = more binary designs

  • volume_fraction (list of float, optional) – Volume fraction constraint for each material (default: [0.25])

  • penalty_schedule (callable, optional) – Function(p, iteration) for penalty continuation. If None, uses constant penalty

  • heavyside (bool, optional) – Apply Heaviside projection for sharper 0-1 designs (default: True)

  • beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.

  • eta (float, optional) – Heaviside projection threshold (default: 0.5)

Notes

  • All arrays stored as CuPy arrays on GPU

  • 5-10x faster than CPU for large problems (>1M DOF)

  • Requires CUDA-capable GPU and CuPy

  • Identical SIMP formulation and Heaviside projection as CPU version

Examples

>>> from pyFANTOM.CUDA import *
>>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = CG(kernel=kernel)
>>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver)
>>> filter = StructuredFilter2D(mesh=mesh, r_min=1.5)
>>> problem = MinimumCompliance(FE=FE, filter=filter, penalty=3.0, volume_fraction=[0.3])
__init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Initialize problem state.

Subclasses may accept finite-element handlers, filters, and other configuration parameters.

N()[source]

Return the number of design variables.

Returns:

Total number of design variables (n_elements * n_materials)

Return type:

int

m()[source]

Return the number of constraints.

Returns:

Number of volume constraints (1 for single material, n_materials for multi-material)

Return type:

int

is_independent()[source]

Check if constraints are independent (required for OC optimizer).

Returns:

True if constraints are independent (always True for MinimumCompliance)

Return type:

bool

constraint_map()[source]

Return mapping of constraints to design variables.

Returns:

  • Single material: 1 (scalar)

  • Multi-material: array of shape (n_materials, n_vars) with 1s indicating which design variables belong to each material constraint

Return type:

int or cp.ndarray

Notes

Used by optimizers to identify which design variables affect each constraint. For multi-material, constraint i affects variables [i*n_elements:(i+1)*n_elements].

bounds()[source]

Return bounds for design variables.

Returns:

(lower_bound, upper_bound) where both are 0.0 and 1.0 respectively

Return type:

tuple

Notes

Design variables represent normalized densities in [0, 1].

visualize_problem(**kwargs)[source]

Visualize problem setup (mesh, BCs, loads).

Parameters:

**kwargs – Arguments passed to FE.visualize_problem()

Returns:

Plot object

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

GPU arrays are automatically transferred to CPU for visualization.

visualize_solution(**kwargs)[source]

Visualize optimized design (density distribution).

Parameters:

**kwargs – Arguments passed to FE.visualize_density()

Returns:

Plot object

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

Shows current design variables as density field. For multi-material problems, displays combined material distribution. GPU arrays transferred to CPU for visualization.

init_desvars()[source]

Initialize design variables to uniform density at volume fraction.

Sets all design variables to the volume fraction constraint value and performs initial FEA solve to compute objective and constraints.

Notes

  • Single material: all variables set to volume_fraction

  • Multi-material: all variables set to min(volume_fraction) for each material

  • Resets iteration counter to 0

  • Triggers _compute() to evaluate objective and constraints

  • All operations performed on GPU

set_desvars(desvars: cupy.ndarray)[source]

Set design variables and recompute objective/constraints.

Parameters:

desvars (cp.ndarray) – Design variables on GPU, shape (n_vars,). Values should be in [0, 1]

Raises:

ValueError – If desvars shape doesn’t match num_vars

Notes

  • Triggers FEA solve and sensitivity computation on GPU

  • Increments iteration counter

  • Updates cached values: _f, _g, _nabla_f, _nabla_g

get_desvars()[source]

Get current design variables.

Returns:

Current design variables on GPU, shape (n_vars,)

Return type:

cp.ndarray

Notes

Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).

penalize(rho: cupy.ndarray)[source]
penalize_grad(rho: cupy.ndarray)[source]
f(rho: cupy.ndarray | None = None)[source]

Compute objective function value (compliance).

Parameters:

rho (cp.ndarray, optional) – Design variables on GPU for linearization. If None, returns cached value

Returns:

Compliance value (F^T @ U). Lower is better (stiffer structure).

Return type:

float

Notes

  • If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)

  • If rho is None: returns cached value from last set_desvars() call

  • Compliance = F^T @ U = U^T @ K @ U (strain energy)

  • All operations performed on GPU

nabla_f(rho: cupy.ndarray | None = None)[source]

Compute objective function gradient (compliance sensitivities).

Parameters:

rho (cp.ndarray, optional) – Unused (for interface compatibility)

Returns:

Gradient of compliance w.r.t. design variables on GPU, shape (n_vars,)

Return type:

cp.ndarray

Notes

  • Uses adjoint method: dC/drho = -U^T @ dK/drho @ U

  • Includes filter adjoint: sens_raw = H^T @ sens_filtered

  • Negative gradient means increasing density reduces compliance (good)

  • All operations performed on GPU

g(rho=None)[source]

Compute constraint values (volume fraction violations).

Parameters:

rho (cp.ndarray, optional) – Design variables on GPU. If None, uses current desvars

Returns:

Constraint violations, shape (n_constraints,). Negative = satisfied. g[i] = (volume_fraction[i] - actual_volume_fraction[i])

Return type:

cp.ndarray

Notes

  • Constraint satisfied when g <= 0

  • For single material: returns scalar

  • For multi-material: returns array with one constraint per material

  • All operations performed on GPU

nabla_g()[source]

Compute constraint gradients (volume fraction sensitivities).

Returns:

Gradient of constraints w.r.t. design variables on GPU. - Single material: shape (n_vars,) - Multi-material: shape (n_materials, n_vars)

Return type:

cp.ndarray

Notes

  • Each row is d(volume_fraction[i])/drho

  • For uniform elements, gradient is constant: 1/volume_total per element

  • Used by optimizers to enforce volume constraints

ill_conditioned()[source]

Check if FEA system is ill-conditioned.

Returns:

True if residual >= 1e-2 (indicates poor solver convergence)

Return type:

bool

Notes

  • Residual > 1e-2 suggests numerical issues (check void parameter, penalty)

  • May indicate near-singular stiffness matrix (too many void elements)

  • Consider increasing void parameter or reducing penalty

is_terminal()[source]

Check if penalty continuation has reached final value.

Returns:

True if penalty schedule is complete or not used

Return type:

bool

Notes

  • Used by optimizers to determine if continuation is finished

  • If penalty_schedule is None, always returns True

  • If penalty_schedule exists, checks if current iteration has reached final penalty

logs()[source]

Return diagnostic information for current iteration.

Returns:

Dictionary with keys: - ‘iteration’: Current iteration number - ‘residual’: FEA solver residual (||K@U - F|| / ||F||)

Return type:

dict

Notes

Used by optimizers to track convergence and diagnose issues.

FEA(thresshold: bool = True)[source]
class pyFANTOM.Problem.CUDA.ComplianceConstrainedMinimumVolume.ComplianceConstrainedMinimumVolume(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E: float = 1.0, void: float = 1e-06, penalty: float = 3.0, compliance_limit: float = 0.25, penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Bases: Problem

Minimum volume topology optimization with compliance constraint (CUDA).

GPU-accelerated version of ComplianceConstrainedMinimumVolume. Inverse formulation of MinimumCompliance: minimize material usage while maintaining structure stiffness above threshold. All computations performed on GPU using CuPy.

Parameters:
  • FE (FiniteElement) – CUDA finite element analysis engine

  • filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter

  • E (float, optional) – Young’s modulus (default: 1.0)

  • void (float, optional) – Minimum density to avoid singularity (default: 1e-6)

  • penalty (float, optional) – SIMP penalization exponent (default: 3.0)

  • compliance_limit (float, optional) – Maximum allowable compliance (normalized by initial compliance) (default: 0.25)

  • penalty_schedule (callable, optional) – Penalty continuation function(p, iteration)

  • heavyside (bool, optional) – Apply Heaviside projection (default: True)

  • beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.

  • eta (float, optional) – Heaviside threshold (default: 0.5)

Notes

  • Objective: Minimize volume

  • Constraint: Compliance ≤ compliance_limit * initial_compliance

  • Use case: Lightweight design, material cost minimization

  • Comparison: More challenging than MinimumCompliance (compliance constraint is nonlinear)

  • Typical compliance_limit: 0.2-0.5 of initial design

  • GPU Acceleration: All arrays stored as CuPy arrays on GPU

  • Performance: 5-10x faster than CPU for large problems

Examples

>>> from pyFANTOM.CUDA import *
>>> problem = ComplianceConstrainedMinimumVolume(
>>>     FE=FE, filter=filter,
>>>     compliance_limit=0.3,  # 30% of initial compliance
>>>     penalty=3.0
>>> )
__init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E: float = 1.0, void: float = 1e-06, penalty: float = 3.0, compliance_limit: float = 0.25, penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Initialize problem state.

Subclasses may accept finite-element handlers, filters, and other configuration parameters.

N()[source]

Return the number of design variables.

Returns:

Total number of design variables (n_elements)

Return type:

int

m()[source]

Return the number of constraints.

Returns:

Number of constraints (1: compliance constraint)

Return type:

int

is_independent()[source]

Check if constraints are independent (required for OC optimizer).

Returns:

True if constraints are independent (always True for ComplianceConstrainedMinimumVolume)

Return type:

bool

constraint_map()[source]

Return mapping of constraints to design variables.

Returns:

Scalar value 1 indicating all design variables affect the compliance constraint

Return type:

int

Notes

Used by optimizers to identify which design variables affect the constraint. For this problem, all variables affect compliance.

bounds()[source]

Return bounds for design variables.

Returns:

(lower_bound, upper_bound) where both are 0.0 and 1.0 respectively

Return type:

tuple

Notes

Design variables represent normalized densities in [0, 1].

visualize_problem(**kwargs)[source]

Visualize problem setup (mesh, BCs, loads).

Parameters:

**kwargs – Arguments passed to FE.visualize_problem()

Returns:

Plot object

Return type:

matplotlib.axes.Axes or k3d.Plot

visualize_solution(**kwargs)[source]

Visualize optimized design (density distribution).

Parameters:

**kwargs – Arguments passed to FE.visualize_density()

Returns:

Plot object

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

Shows current design variables as density field.

init_desvars()[source]

Initialize design variables to full density (volume = 1.0).

Sets all design variables to 1.0 (full material) and performs initial FEA solve to compute objective and constraints. This ensures initial design satisfies compliance constraint.

Notes

  • All variables set to 1.0 (full material)

  • Resets iteration counter to 0

  • Triggers _compute() to evaluate objective and constraints

set_desvars(desvars: cupy.ndarray)[source]

Set design variables and recompute objective/constraints.

Parameters:

desvars (cupy.ndarray) – Design variables, shape (n_vars,). Values should be in [0, 1]

Raises:

ValueError – If desvars shape doesn’t match num_vars

Notes

  • Triggers FEA solve and sensitivity computation

  • Increments iteration counter

  • Updates cached values: _f, _g, _nabla_f, _nabla_g

get_desvars()[source]

Get current design variables.

Returns:

Current design variables, shape (n_vars,)

Return type:

cupy.ndarray

Notes

Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).

penalize(rho: cupy.ndarray)[source]

Apply SIMP penalization and Heaviside projection.

Parameters:

rho (cupy.ndarray) – Filtered densities, shape (n_elements,)

Returns:

Penalized Young’s moduli, shape (n_elements,)

Return type:

cupy.ndarray

Notes

  • Applies Heaviside projection if enabled

  • Applies SIMP: E = E0 * rho^penalty

  • Clips to void value to avoid singularity

  • Uses penalty_schedule if provided

  • Uses beta_schedule if beta is callable

penalize_grad(rho: cupy.ndarray)[source]

Compute gradient of penalization function.

Parameters:

rho (cupy.ndarray) – Filtered densities, shape (n_elements,)

Returns:

Gradient dE/drho, shape (n_elements,)

Return type:

cupy.ndarray

Notes

  • Derivative of SIMP + Heaviside projection

  • Used in chain rule for sensitivity analysis

  • Accounts for penalty_schedule if provided

  • Accounts for beta_schedule if beta is callable

f(rho: cupy.ndarray | None = None)[source]

Compute objective function value (volume).

Parameters:

rho (cupy.ndarray, optional) – Design variables for linearization. If None, returns cached value

Returns:

Volume fraction (normalized by total domain volume). Lower is better.

Return type:

float

Notes

  • If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)

  • If rho is None: returns cached value from last set_desvars() call

  • Objective is to minimize material usage while satisfying compliance constraint

nabla_f(rho: cupy.ndarray | None = None)[source]

Compute objective function gradient (volume sensitivities).

Parameters:

rho (cupy.ndarray, optional) – Unused (for interface compatibility)

Returns:

Gradient of volume w.r.t. design variables, shape (n_vars,)

Return type:

cupy.ndarray

Notes

  • Constant gradient: 1/volume_total per element

  • All elements contribute equally to volume

  • Used by optimizers to minimize material usage

g(rho=None)[source]

Compute constraint values (compliance constraint violations).

Parameters:

rho (cupy.ndarray, optional) – Design variables. If None, uses current desvars

Returns:

Normalized compliance constraint violation. Negative = satisfied. g = (compliance - compliance_limit) / compliance_limit

Return type:

float

Notes

  • Constraint satisfied when g <= 0

  • Normalized by compliance_limit for better scaling

  • If rho provided, uses linearized approximation

nabla_g()[source]

Compute constraint gradients (compliance sensitivities).

Returns:

Gradient of compliance constraint w.r.t. design variables, shape (n_vars,)

Return type:

cupy.ndarray

Notes

  • Normalized by compliance_limit for better scaling

  • Uses adjoint method: dC/drho = -U^T @ dK/drho @ U

  • Includes filter adjoint: sens_raw = H^T @ sens_filtered

  • Used by optimizers to enforce compliance constraint

ill_conditioned()[source]

Check if FEA system is ill-conditioned.

Returns:

True if residual >= 1e-2 (indicates poor solver convergence)

Return type:

bool

Notes

  • Residual > 1e-2 suggests numerical issues (check void parameter, penalty)

  • May indicate near-singular stiffness matrix (too many void elements)

  • Consider increasing void parameter or reducing penalty

is_terminal()[source]

Check if penalty continuation has reached final value.

Returns:

True if penalty schedule is complete or not used

Return type:

bool

Notes

  • Used by optimizers to determine if continuation is finished

  • If penalty_schedule is None, always returns True

  • If penalty_schedule exists, checks if current iteration has reached final penalty

logs()[source]

Return diagnostic information for current iteration.

Returns:

Dictionary with keys: - ‘iteration’: Current iteration number - ‘residual’: FEA solver residual (||K@U - F|| / ||F||)

Return type:

dict

Notes

Used by optimizers to track convergence and diagnose issues.

class pyFANTOM.Problem.CUDA.WeightDistributionMinimumCompliance.WeightDistributionMinimumCompliance(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, target_centroid: List[float], maximum_distance: float, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Bases: Problem

Minimum compliance with weight distribution constraint (CUDA).

GPU-accelerated version of WeightDistributionMinimumCompliance. Extends MinimumCompliance with an additional constraint on the centroid location of the material distribution. Useful for balancing structures or controlling center of mass for dynamic applications. All computations performed on GPU using CuPy.

Parameters:
  • FE (FiniteElement) – CUDA finite element analysis engine with mesh, kernel, and solver

  • filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter for ensuring minimum feature sizes

  • target_centroid (List[float]) – Target centroid location in physical coordinates, shape (2,) for 2D or (3,) for 3D

  • maximum_distance (float) – Maximum allowed distance from target centroid (constraint: ||centroid - target|| <= maximum_distance)

  • E_mul (list of float, optional) – Young’s modulus multipliers for each material (default: [1.0] for single material)

  • void (float, optional) – Minimum density to avoid singularity (default: 1e-6)

  • penalty (float, optional) – SIMP penalization exponent (default: 3.0). Higher = more binary designs

  • volume_fraction (list of float, optional) – Volume fraction constraint for each material (default: [0.25])

  • penalty_schedule (callable, optional) – Function(p, iteration) for penalty continuation. If None, uses constant penalty

  • heavyside (bool, optional) – Apply Heaviside projection for sharper 0-1 designs (default: True)

  • beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.

  • eta (float, optional) – Heaviside projection threshold (default: 0.5)

is_single_material

True for single material, False for multi-material optimization

Type:

bool

n_material

Number of materials

Type:

int

iteration

Current iteration count

Type:

int

target_centroid

Target centroid location

Type:

cupy.ndarray

maximum_distance

Maximum allowed distance from target

Type:

float

f()[source]

Compute compliance objective

nabla_f()[source]

Compute compliance sensitivities

g()[source]

Compute constraint values (volume + centroid distance)

nabla_g()[source]

Compute constraint gradients

get_desvars()[source]

Get current design variables

set_desvars(rho)[source]

Set design variables and trigger FEA solve

visualize_solution(**kwargs)[source]

Plot optimized design

Notes

Constraints: - Volume constraint: Same as MinimumCompliance - Centroid constraint: ||centroid(rho) - target_centroid|| <= maximum_distance - Constraints are coupled (not independent), so OC optimizer cannot be used

Centroid Computation: - Centroid = sum(rho[i] * volume[i] * centroid[i]) / sum(rho[i] * volume[i]) - Weighted by element density and volume

Use Cases: - Balancing structures for dynamic loads - Controlling center of mass - Symmetric designs with off-center loading - Multi-material designs with weight distribution requirements

GPU Acceleration: - All arrays stored as CuPy arrays on GPU - 5-10x faster than CPU for large problems

Examples

>>> from pyFANTOM.CUDA import *
>>> # Setup FEA
>>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = CG(kernel=kernel)
>>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver)
>>>
>>> # Apply BCs and loads
>>> FE.add_dirichlet_boundary_condition(node_ids=fixed_nodes, rhs=0)
>>> FE.add_point_forces(node_ids=load_nodes, forces=cp.array([[0, -1.0]]))
>>>
>>> # Define optimization problem with centroid constraint
>>> filter = StructuredFilter2D(mesh=mesh, r_min=1.5)
>>> problem = WeightDistributionMinimumCompliance(
>>>     FE=FE, filter=filter,
>>>     target_centroid=[1.0, 0.5],  # Center of 2D domain
>>>     maximum_distance=0.1,  # 10% of domain size
>>>     penalty=3.0, volume_fraction=[0.3]
>>> )
__init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, target_centroid: List[float], maximum_distance: float, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Initialize problem state.

Subclasses may accept finite-element handlers, filters, and other configuration parameters.

N()[source]

Return the number of design variables.

Returns:

Total number of design variables (n_elements * n_materials)

Return type:

int

m()[source]

Return the number of constraints.

Returns:

Number of constraints: - Single material: 2 (volume + centroid distance) - Multi-material: n_materials + 1 (volume per material + centroid distance)

Return type:

int

is_independent()[source]

Check if constraints are independent (required for OC optimizer).

Returns:

Always False. Centroid constraint couples all design variables, making constraints non-independent. Use MMA or PGD optimizer instead of OC.

Return type:

bool

constraint_map()[source]

Return mapping of constraints to design variables.

Returns:

  • Single material: 1 (scalar, all variables affect both constraints)

  • Multi-material: array of shape (n_materials + 1, n_vars) where: - Rows 0 to n_materials-1: volume constraints per material - Last row: centroid constraint (all variables = 1)

Return type:

int or ndarray

Notes

Used by optimizers to identify which design variables affect each constraint. The centroid constraint depends on all design variables, making constraints coupled.

bounds()[source]

Return bounds for design variables.

Returns:

(lower_bound, upper_bound) where both are 0.0 and 1.0 respectively

Return type:

tuple

Notes

Design variables represent normalized densities in [0, 1].

visualize_problem(**kwargs)[source]

Visualize problem setup (mesh, BCs, loads).

Parameters:

**kwargs – Arguments passed to FE.visualize_problem()

Returns:

Plot object

Return type:

matplotlib.axes.Axes or k3d.Plot

visualize_solution(**kwargs)[source]

Visualize optimized design (density distribution).

Parameters:

**kwargs – Arguments passed to FE.visualize_density()

Returns:

Plot object

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

Shows current design variables as density field. For multi-material problems, displays combined material distribution.

init_desvars()[source]

Initialize design variables with random smoothed distribution.

Uses random initialization followed by triple filtering to create a smooth initial guess. This helps avoid getting stuck in poor local minima that may violate the centroid constraint.

Notes

  • Random initialization: uniform distribution in [min(0, 2*vf-1), 1.0]

  • Triple filtering: applies filter 3 times for smoothness

  • Normalized to match volume fraction constraint

  • Resets iteration counter to 0

  • Triggers _compute() to evaluate objective and constraints

set_desvars(desvars: cupy.ndarray)[source]

Set design variables and recompute objective/constraints.

Parameters:

desvars (cupy.ndarray) – Design variables, shape (n_vars,). Values should be in [0, 1]

Raises:

ValueError – If desvars shape doesn’t match num_vars

Notes

  • Triggers FEA solve and sensitivity computation

  • Increments iteration counter

  • Updates cached values: _f, _g, _nabla_f, _nabla_g

  • Computes centroid and centroid constraint gradient

get_desvars()[source]

Get current design variables.

Returns:

Current design variables, shape (n_vars,)

Return type:

cupy.ndarray

Notes

Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).

penalize(rho: cupy.ndarray)[source]

Apply SIMP penalization and Heaviside projection.

Parameters:

rho (cupy.ndarray) – Filtered densities, shape (n_elements,) or (n_elements, n_materials)

Returns:

Penalized Young’s moduli, shape (n_elements,)

Return type:

cupy.ndarray

Notes

  • Applies Heaviside projection if enabled

  • Applies SIMP: E = E0 * rho^penalty

  • Clips to void value to avoid singularity

  • Uses penalty_schedule if provided

  • Uses beta_schedule if beta is callable

  • For multi-material: uses material interpolation scheme

penalize_grad(rho: cupy.ndarray)[source]

Compute gradient of penalization function.

Parameters:

rho (cupy.ndarray) – Filtered densities, shape (n_elements,) or (n_elements, n_materials)

Returns:

Gradient dE/drho, shape (n_elements,) or (n_elements, n_materials)

Return type:

cupy.ndarray

Notes

  • Derivative of SIMP + Heaviside projection

  • Used in chain rule for sensitivity analysis

  • Accounts for penalty_schedule if provided

  • Accounts for beta_schedule if beta is callable

  • For multi-material: returns gradient per material

f(rho: cupy.ndarray | None = None)[source]

Compute objective function value (compliance).

Parameters:

rho (cupy.ndarray, optional) – Design variables for linearization. If None, returns cached value

Returns:

Compliance value (F^T @ U). Lower is better (stiffer structure).

Return type:

float

Notes

  • If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)

  • If rho is None: returns cached value from last set_desvars() call

  • Compliance = F^T @ U = U^T @ K @ U (strain energy)

nabla_f(rho: cupy.ndarray | None = None)[source]

Compute objective function gradient (compliance sensitivities).

Parameters:

rho (cupy.ndarray, optional) – Unused (for interface compatibility)

Returns:

Gradient of compliance w.r.t. design variables, shape (n_vars,)

Return type:

cupy.ndarray

Notes

  • Uses adjoint method: dC/drho = -U^T @ dK/drho @ U

  • Includes filter adjoint: sens_raw = H^T @ sens_filtered

  • Negative gradient means increasing density reduces compliance (good)

g(rho=None)[source]

Compute constraint values (volume + centroid distance violations).

Parameters:

rho (cupy.ndarray, optional) – Design variables. If None, uses current desvars

Returns:

Constraint violations, shape (n_constraints,). Negative = satisfied. - First n_materials constraints: volume fraction violations - Last constraint: centroid distance violation (distance^2 - maximum_distance^2)

Return type:

cupy.ndarray

Notes

  • Constraint satisfied when g <= 0

  • For single material: returns [volume_violation, centroid_violation]

  • For multi-material: returns [vol_viol_1, …, vol_viol_n, centroid_violation]

  • If rho provided, uses linearized approximation

nabla_g()[source]

Compute constraint gradients (volume + centroid sensitivities).

Returns:

Gradient of constraints w.r.t. design variables. - Single material: shape (2, n_vars) - [volume_grad, centroid_grad] - Multi-material: shape (n_materials + 1, n_vars) - [vol_grad_1, …, vol_grad_n, centroid_grad]

Return type:

cupy.ndarray

Notes

  • First rows: d(volume_fraction[i])/drho (constant for uniform elements)

  • Last row: d(centroid_distance^2)/drho (depends on element positions)

  • Centroid gradient accounts for weighted average of element centroids

  • Used by optimizers to enforce constraints

ill_conditioned()[source]

Check if FEA system is ill-conditioned.

Returns:

True if residual >= 1e-2 (indicates poor solver convergence)

Return type:

bool

Notes

  • Residual > 1e-2 suggests numerical issues (check void parameter, penalty)

  • May indicate near-singular stiffness matrix (too many void elements)

  • Consider increasing void parameter or reducing penalty

is_terminal()[source]

Check if penalty continuation has reached final value.

Returns:

True if penalty schedule is complete or not used

Return type:

bool

Notes

  • Used by optimizers to determine if continuation is finished

  • If penalty_schedule is None, always returns True

  • If penalty_schedule exists, checks if current iteration has reached final penalty

logs()[source]

Return diagnostic information for current iteration.

Returns:

Dictionary with keys: - ‘iteration’: Current iteration number - ‘residual’: FEA solver residual (||K@U - F|| / ||F||)

Return type:

dict

Notes

Used by optimizers to track convergence and diagnose issues.