CUDA Backend

The CUDA backend provides GPU-accelerated implementations of all core functionality for topology optimization.

CUDA backend public API.

Importing from this module gives access to GPU/CUDA implementations of core pyFANTOM components. The API mirrors the CPU backend so users can switch backends by changing the import path.

>>> from pyFANTOM.CUDA import StructuredMesh2D, FiniteElement, MinimumCompliance

Mesh Classes

pyFANTOM.geom.CUDA._mesh.CuStructuredMesh2D - CUDA 2D structured mesh
pyFANTOM.geom.CUDA._mesh.CuStructuredMesh3D - CUDA 3D structured mesh
pyFANTOM.geom.CUDA._mesh.CuGeneralMesh - CUDA general unstructured mesh

Filter Classes

pyFANTOM.geom.CUDA._filters.CuStructuredFilter2D - CUDA 2D structured density filter
pyFANTOM.geom.CUDA._filters.CuStructuredFilter3D - CUDA 3D structured density filter
pyFANTOM.geom.CUDA._filters.CuGeneralFilter - CUDA general density filter

Finite Element Classes

pyFANTOM.FiniteElement.CUDA.FiniteElement.FiniteElement - CUDA finite element analysis engine

Stiffness Kernel Classes

pyFANTOM.stiffness.CUDA._FEA.StructuredStiffnessKernel - CUDA structured mesh stiffness kernel
pyFANTOM.stiffness.CUDA._FEA.GeneralStiffnessKernel - CUDA general mesh stiffness kernel
pyFANTOM.stiffness.CUDA._FEA.UniformStiffnessKernel - CUDA uniform element stiffness kernel

Solver Classes

pyFANTOM.solvers.CUDA._solvers.CG - CUDA Conjugate Gradient iterative solver
pyFANTOM.solvers.CUDA._solvers.GMRES - CUDA GMRES iterative solver
pyFANTOM.solvers.CUDA._solvers.SPSOLVE - CUDA sparse direct solver
pyFANTOM.solvers.CUDA._solvers.MultiGrid - CUDA multigrid iterative solver

Optimizer Classes

pyFANTOM.Optimizers.CUDA.MMA.MMA - CUDA Method of Moving Asymptotes optimizer
pyFANTOM.Optimizers.CUDA.OC.OC - CUDA Optimality Criteria optimizer
pyFANTOM.Optimizers.CUDA.PGD.PGD - CUDA Projected Gradient Descent optimizer

Problem Classes

pyFANTOM.Problem.CUDA.MinimumCompliance.MinimumCompliance - CUDA minimum compliance problem
pyFANTOM.Problem.CUDA.ComplianceConstrainedMinimumVolume.ComplianceConstrainedMinimumVolume - CUDA volume-constrained compliance problem
pyFANTOM.Problem.CUDA.WeightDistributionMinimumCompliance.WeightDistributionMinimumCompliance - CUDA weight distribution problem

Detailed Documentation

class pyFANTOM.geom.CUDA._mesh.CuStructuredMesh2D(nx, ny, lx, ly, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

Bases: StructuredMesh

CUDA-accelerated 2D structured mesh with uniform rectangular elements.

GPU version of StructuredMesh2D using CuPy arrays for all data storage. Identical API to CPU version but operates on GPU memory for efficient assembly and computation.

Parameters:

nx (int) – Number of elements in x-direction
ny (int) – Number of elements in y-direction
lx (float) – Physical length of domain in x-direction
ly (float) – Physical length of domain in y-direction
dtype (np.dtype, optional) – Data type for arrays (default: np.float64)
physics (Physx, optional) – Physics model defining material behavior (default: LinearElasticity(E=1.0, nu=1/3))

Notes

All arrays are stored as CuPy arrays on GPU
Element stiffness matrix is computed on CPU then transferred to GPU
Use with CuStructuredStiffnessKernel for GPU-accelerated assembly
Requires CUDA-capable GPU and CuPy installation

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D
>>> from pyFANTOM import LinearElasticity
>>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0)
>>> print(f"GPU memory usage: {mesh.nodes.nbytes + mesh.elements.nbytes} bytes")

__init__(nx, ny, lx, ly, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

class pyFANTOM.geom.CUDA._mesh.CuStructuredMesh3D(nx, ny, nz, lx, ly, lz, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

Bases: StructuredMesh

CUDA-accelerated 3D structured mesh with uniform hexahedral elements.

GPU version of StructuredMesh3D using CuPy arrays for all data storage. Identical API to CPU version but operates on GPU memory for efficient 3D topology optimization.

Parameters:

nx (int) – Number of elements in x-direction
ny (int) – Number of elements in y-direction
nz (int) – Number of elements in z-direction
lx (float) – Physical length of domain in x-direction
ly (float) – Physical length of domain in y-direction
lz (float) – Physical length of domain in z-direction
dtype (np.dtype, optional) – Data type for arrays (default: np.float64)
physics (Physx, optional) – Physics model defining material behavior (default: LinearElasticity(E=1.0, nu=1/3))

Notes

All arrays are stored as CuPy arrays on GPU
Use MultiGrid solver for large 3D problems to manage GPU memory
Element stiffness matrix computed on CPU then transferred
Requires sufficient GPU memory for 3D problems (typically >8GB for 100^3 elements)

Examples

>>> from pyFANTOM.CUDA import StructuredMesh3D
>>> mesh = StructuredMesh3D(nx=64, ny=64, nz=32, lx=1.0, ly=1.0, lz=0.5)
>>> print(f"Total DOF: {len(mesh.nodes) * mesh.dof}")

__init__(nx, ny, nz, lx, ly, lz, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

class pyFANTOM.geom.CUDA._mesh.CuGeneralMesh(nodes, elements, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

Bases: GeneralMesh

CUDA-accelerated general unstructured mesh.

GPU version of GeneralMesh. Inherits CPU mesh initialization logic then transfers all arrays to GPU memory. Supports both uniform and heterogeneous element topologies.

Parameters:

nodes (ndarray) – Node coordinates (CPU array), shape (n_nodes, spatial_dim)
elements (list or ndarray) – Element connectivity (CPU array or list)
dtype (np.dtype, optional) – Data type for arrays (default: np.float64)
physics (Physx, optional) – Physics model (default: LinearElasticity(E=1.0, nu=1/3))

Notes

Mesh processing (node cleanup, stiffness computation) happens on CPU
Resulting arrays are transferred to GPU after initialization
Use CuGeneralStiffnessKernel for GPU-accelerated assembly
For large unstructured meshes, GPU offers significant speedup over CPU

Examples

>>> from pyFANTOM.CUDA import GeneralMesh
>>> import numpy as np
>>> nodes = np.array([[0,0], [1,0], [0,1], [1,1]])
>>> elements = np.array([[0,1,2], [1,3,2]])
>>> mesh = GeneralMesh(nodes, elements)

__init__(nodes, elements, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]

class pyFANTOM.geom.CUDA._filters.CuStructuredFilter2D(mesh: CuStructuredMesh2D, r_min)[source]

Bases: FilterKernel

CUDA-accelerated 2D density filter for structured meshes.

GPU version of StructuredFilter2D using CuPy. Fast convolution-based filtering on GPU. Identical API to CPU version but operates on GPU memory for maximum performance.

Parameters:

mesh (CuStructuredMesh2D) – CUDA 2D structured mesh
r_min (float) – Filter radius in element units

shape

Filter matrix dimensions (n_elements, n_elements)

Type:: tuple

weights

Precomputed filter weights on GPU

Type:: cupy.ndarray

offsets

Element index offsets for neighbors on GPU

Type:: cupy.ndarray

normalizer

Per-element normalization for adjoint on GPU

Type:: cupy.ndarray

dot(rho): Forward filter application

_rmatvec(sens)[source]: Adjoint filter for sensitivity backpropagation

Notes

Uses cone filter: w(d) = (r_min - d) / r_min
Handles non-square elements (dx != dy) automatically
Typical r_min values: 1.5-3.0 element units
Larger r_min produces smoother, coarser designs
All arrays stored on GPU as CuPy arrays
GPU-accelerated filtering for maximum performance
Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredFilter2D
>>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0)
>>> filter = StructuredFilter2D(mesh=mesh, r_min=2.0)
>>> rho_smooth = filter.dot(rho)

__init__(mesh: CuStructuredMesh2D, r_min)[source]

class pyFANTOM.geom.CUDA._filters.CuStructuredFilter3D(mesh: CuStructuredMesh3D, r_min)[source]

Bases: FilterKernel

CUDA-accelerated 3D density filter for structured meshes.

GPU version of StructuredFilter3D using CuPy. All filtering operations on GPU for maximum performance in 3D topology optimization. Identical API to CPU version but operates on GPU memory.

Parameters:

mesh (CuStructuredMesh3D) – CUDA 3D structured mesh
r_min (float) – Filter radius in element units (e.g., r_min=1.5 includes elements within 1.5 element widths)

shape

Filter matrix dimensions (n_elements, n_elements)

Type:: tuple

weights

Precomputed filter weights for neighbor elements on GPU

Type:: cupy.ndarray

offsets

Element index offsets for neighbors within filter radius on GPU

Type:: cupy.ndarray

normalizer

Per-element normalization factors for adjoint operation on GPU

Type:: cupy.ndarray

dot(rho): Apply forward filter: filtered_rho = H @ rho

_rmatvec(sens)[source]: Apply adjoint filter for sensitivity backpropagation: H^T @ sens

Notes

Filter uses cone-shaped kernel: w(d) = (r_min - d) / r_min for d < r_min
Automatically handles anisotropic elements (dx != dy != dz)
Adjoint includes proper normalization for optimization gradients
Memory-efficient: stores only unique weights and offsets, not full matrix
All arrays stored on GPU as CuPy arrays
5-10x faster than CPU for large 3D problems
Essential for large-scale 3D optimization
Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import StructuredMesh3D, StructuredFilter3D
>>> mesh = StructuredMesh3D(nx=64, ny=64, nz=32, lx=1.0, ly=1.0, lz=0.5)
>>> filter = StructuredFilter3D(mesh=mesh, r_min=1.5)
>>> rho_filtered = filter.dot(rho_raw)

__init__(mesh: CuStructuredMesh3D, r_min)[source]

class pyFANTOM.geom.CUDA._filters.CuGeneralFilter(mesh: CuGeneralMesh, r_min)[source]

Bases: FilterKernel

CUDA-accelerated density filter for unstructured meshes.

GPU version of GeneralFilter using CuPy sparse matrices. Implements density filtering for general unstructured meshes using spatial search. Stores explicit sparse filter matrix.

Parameters:

mesh (CuGeneralMesh) – CUDA-accelerated general unstructured mesh
r_min (float) – Filter radius in physical units (not element units)

kernel

Explicit filter matrix stored on GPU, shape (n_elements, n_elements)

Type:: cupyx.scipy.sparse.csr_matrix

shape

Filter matrix dimensions

Type:: tuple

dot(rho): Forward filter: H @ rho

_rmatvec(sens)[source]: Adjoint filter: H^T @ sens

Notes

Uses spatial search (KDTree on CPU) to build filter, then transfers to GPU
Stores explicit sparse matrix (memory-intensive for large meshes)
r_min is in physical units, not element-relative
Automatically detects 2D vs 3D from mesh
GPU acceleration provides 3-5x speedup over CPU for large meshes

Examples

>>> from pyFANTOM.CUDA import GeneralMesh, GeneralFilter
>>> mesh = GeneralMesh(nodes, elements)  # Triangular mesh
>>> filter = GeneralFilter(mesh=mesh, r_min=0.05)  # Physical radius
>>> rho_filtered = filter.dot(rho)

__init__(mesh: CuGeneralMesh, r_min)[source]

Bases: FiniteElement

CUDA-accelerated finite element analysis engine.

GPU version of FiniteElement using CuPy for all computations. Provides identical API to CPU version but with 5-10x performance improvement for large problems. All arrays stored on GPU memory.

Parameters:

mesh (StructuredMesh2D, StructuredMesh3D, or GeneralMesh) – CUDA mesh defining geometry and physics
kernel (StructuredStiffnessKernel, GeneralStiffnessKernel, or UniformStiffnessKernel) – CUDA stiffness assembly kernel
solver (CG, GMRES, SPSOLVE, or MultiGrid) – CUDA linear system solver

mesh

Associated CUDA mesh

Type:: Mesh

kernel

Stiffness assembly kernel

Type:: StiffnessKernel

solver

Linear solver

Type:: Solver

rhs

Right-hand side force vector on GPU, shape (n_nodes * dof,)

Type:: cupy.ndarray

d_rhs

Prescribed displacement values for Dirichlet BCs, shape (n_nodes * dof,)

Type:: cupy.ndarray

dof

Degrees of freedom per node (2 for 2D, 3 for 3D)

Type:: int

is_3D

True for 3D problems

Type:: bool

nel

Number of elements

Type:: int

Notes

All arrays stored as CuPy arrays on GPU
KDTree searches performed on CPU (temporary transfer)
Visualization methods transfer data to CPU automatically
Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import *
>>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = CG(kernel=kernel)
>>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver)
>>>
>>> # Apply BCs and loads
>>> FE.add_dirichlet_boundary_condition(node_ids=fixed_nodes, rhs=0)
>>> FE.add_point_forces(node_ids=load_nodes, forces=cp.array([[0, -1.0]]))
>>>
>>> # Solve
>>> U, residual = FE.solve(rho=cp.ones(nel) * 0.5)

Initialize CUDA finite element analysis engine.

Parameters:

mesh (StructuredMesh2D, StructuredMesh3D, or GeneralMesh) – CUDA mesh defining geometry and physics
kernel (StructuredStiffnessKernel, GeneralStiffnessKernel, or UniformStiffnessKernel) – CUDA stiffness assembly kernel
solver (CG, GMRES, SPSOLVE, or MultiGrid) – CUDA linear system solver

Notes

Initializes force vector (rhs) and Dirichlet BC storage (d_rhs) on GPU. All arrays remain on GPU throughout computation.

add_dirichlet_boundary_condition(node_ids: cupy.ndarray | None = None, positions: cupy.ndarray | None = None, dofs: cupy.ndarray | None = None, rhs: float | cupy.ndarray = 0.0)[source]

Apply Dirichlet (fixed displacement) boundary conditions.

Parameters:

node_ids (cp.ndarray, optional) – Node indices to constrain on GPU, shape (n_nodes,)
positions (cp.ndarray, optional) – Physical coordinates to constrain (uses KDTree search on CPU), shape (n_nodes, spatial_dim)
dofs (cp.ndarray, optional) – DOF mask per node, shape (n_nodes, dof) with 1=constrained, 0=free. If None, all DOFs at specified nodes are constrained
rhs (float or cp.ndarray, optional) – Prescribed displacement values. Scalar for zero displacement, or array shape (n_nodes, dof)

Notes

Provide either node_ids OR positions, not both
dofs array: [[1,0]] constrains only x-displacement in 2D
Multiple calls accumulate constraints
Modifies kernel.constraints boolean array on GPU
KDTree search performed on CPU (nodes transferred temporarily)

add_neumann_boundary_condition(**kwargs)[source]

Apply Neumann boundary conditions (not implemented).

Raises:: NotImplementedError – Always raised. Neumann boundary conditions not yet implemented for CUDA FEA.

add_point_forces(forces: cupy.ndarray, node_ids: cupy.ndarray | None = None, positions: cupy.ndarray | None = None)[source]

Apply point loads to specified nodes.

Parameters:

forces (cp.ndarray) – Force vectors on GPU, shape (n_forces, dof) where dof is 2 for 2D or 3 for 3D. For 2D: [[fx, fy]], for 3D: [[fx, fy, fz]]
node_ids (cp.ndarray, optional) – Node indices for force application on GPU, shape (n_forces,)
positions (cp.ndarray, optional) – Physical coordinates for force application (uses KDTree search on CPU)

Notes

Provide either node_ids OR positions
Multiple calls accumulate forces
Forces are automatically set to prescribed values at Dirichlet nodes
Units should match physics model (e.g., Newtons for SI units)
KDTree search performed on CPU (nodes transferred temporarily)

reset_forces()[source]

Clear all applied forces.

Sets the right-hand side force vector to zero on GPU. Useful when reconfiguring loading conditions or starting a new analysis.

reset_dirichlet_boundary_conditions()[source]

Remove all Dirichlet boundary conditions.

Clears all fixed displacement constraints and resets the kernel’s constraint state. Useful when reconfiguring boundary conditions.

visualize_problem(ax=None, **kwargs)[source]

Visualize mesh with boundary conditions and loads.

Parameters:

ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure
**kwargs – Additional arguments passed to plot_problem_2D/3D

Returns:

Plot object (2D: matplotlib axes, 3D: k3d plot)

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

GPU arrays are transferred to CPU for visualization (.get() calls)
Shows mesh elements, fixed nodes, and applied forces

visualize_density(rho, ax=None, **kwargs)[source]

Visualize density distribution (optimization result).

Parameters:

rho (cp.ndarray or ndarray) – Element densities on GPU or CPU, shape (n_elements,) or (n_elements, n_materials)
ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure
**kwargs – Additional arguments passed to plot_problem_2D/3D

Returns:

Plot object (2D: matplotlib axes, 3D: k3d plot)

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

GPU arrays are automatically transferred to CPU for visualization
Color-codes elements by density value (0=void, 1=solid)

visualize_field(field, ax=None, rho=None, **kwargs)[source]

Visualize scalar or vector field (displacement, stress, strain, etc.).

Parameters:

field (cp.ndarray or ndarray) – Field values to plot on GPU or CPU. Shape depends on field type: - Scalar: (n_elements,) or (n_nodes,) - Vector: (n_nodes, dof) for displacement-like fields
ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure
rho (cp.ndarray or ndarray, optional) – Density mask to hide void regions, shape (n_elements,)
**kwargs – Additional arguments passed to plot_field_2D/3D

Returns:

Plot object (2D: matplotlib axes, 3D: k3d plot)

Return type:

matplotlib.axes.Axes or k3d.Plot

Notes

GPU arrays are automatically transferred to CPU for visualization
Common fields: displacement, Von Mises stress, strain energy
If rho provided, elements with rho < 0.5 are hidden

solve(rho=None)[source]

Solve finite element system: K(rho) @ U = F on GPU.

Parameters:

rho (cp.ndarray, optional) – Element density variables on GPU, shape (n_elements,). If None, uses rho=1 (full density)

Returns:

U (cp.ndarray) – Displacement vector on GPU, shape (n_nodes * dof,). Interleaved DOFs: [ux0, uy0, ux1, uy1, …]
residual (float) – Normalized residual ||K@U - F|| / ||F|| for solution validation

Notes

residual < 1e-5 indicates accurate solution
residual > 1e-2 indicates ill-conditioned system
All operations performed on GPU for maximum performance
Solver reuses factorization if available
For topology optimization, rho represents material distribution

class pyFANTOM.stiffness.CUDA._FEA.StructuredStiffnessKernel(mesh: CuStructuredMesh2D | CuStructuredMesh3D)[source]

Bases: StiffnessKernel

CUDA-accelerated stiffness kernel for structured meshes.

GPU version of CPU StructuredStiffnessKernel using CuPy. Uses node-basis assembly with a single precomputed element stiffness matrix for maximum efficiency on uniform grid meshes. Identical API to CPU version but operates entirely on GPU memory for maximum performance.

Parameters:: mesh (CuStructuredMesh2D or CuStructuredMesh3D) – CUDA-accelerated structured mesh with uniform elements

K_single

Single element stiffness matrix used for all elements (on GPU)

Type:: cupy.ndarray

shape

Global stiffness matrix dimensions (n_dof, n_dof)

Type:: tuple

constraints

Boolean array marking constrained DOFs (on GPU)

Type:: cupy.ndarray (bool)

has_cons

True if boundary conditions have been applied

Type:: bool

non_con_map

Index map to non-constrained DOFs for reduced system (on GPU)

Type:: cupy.ndarray

set_rho(rho): Set design variables for subsequent matrix-free operations

dot(rhs)[source]: Matrix-vector product K @ rhs (requires rho to be set)

construct(rho)[source]: Build explicit CSR matrix representation

diagonal(rho=None)[source]: Extract diagonal of stiffness matrix

add_constraints(dof_indices)[source]: Apply Dirichlet boundary conditions

process_grad(U)[source]: Compute element-wise sensitivities dK/drho : U for adjoint method

Notes

Fastest kernel for structured meshes (2-3x faster than UniformStiffnessKernel on GPU)
All arrays stored as CuPy arrays on GPU
5-10x faster than CPU for large 3D problems
Automatically handles constrained DOFs by zeroing rows/columns
Matrix-free dot() operation uses optimized CUDA kernels
Use construct() only when explicit matrix is needed (e.g., for direct solvers)
Requires CUDA-capable GPU and CuPy
Use with CUDA solvers (CG, MultiGrid)
Memory limited by GPU VRAM

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel
>>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> kernel.set_rho(rho)
>>> u = kernel.dot(f)  # Matrix-free K @ f on GPU

__init__(mesh: CuStructuredMesh2D | CuStructuredMesh3D)[source]

Initialize base stiffness kernel.

Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.

diagonal(rho=None)[source]

Extract diagonal of assembled stiffness matrix.

Parameters:: rho (cp.ndarray, optional) – Design densities. If None, uses stored rho
Returns:: diag – Diagonal entries of K(rho)
Return type:: cp.ndarray

Notes

Used for Jacobi preconditioning in iterative solvers
Cached on subsequent calls with same rho

set_constraints(constraints)[source]

Set Dirichlet boundary conditions (replaces existing constraints).

Parameters:: constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Clears all existing constraints and sets only the specified DOFs. Use add_constraints() to add constraints incrementally.

add_constraints(constraints)[source]

Add Dirichlet boundary conditions (accumulates with existing constraints).

Parameters:: constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Adds to existing constraints. Use set_constraints() to replace all constraints.

process_grad(U)[source]

Compute element-wise compliance sensitivities dC/drho.

Parameters:: U (cp.ndarray) – Displacement vector from FEA solve on GPU, shape (n_dof,)
Returns:: dk – Element sensitivities on GPU (compliance derivative per element), shape (n_elements,)
Return type:: cp.ndarray

Notes

Computes: dk[e] = -U[e]^T @ K_e @ U[e]
Used for compliance minimization gradient
Negative sign convention for minimization
All operations performed on GPU

construct(rho)[source]

Build explicit CSR sparse matrix representation on GPU.

Parameters:: rho (cp.ndarray) – Design variables on GPU, shape (n_elements,)
Returns:: Sparse stiffness matrix in CSR format on GPU
Return type:: cupyx.scipy.sparse.csr_matrix

Notes

Memory-intensive: O(nnz) storage on GPU
Required for direct solvers (SPSOLVE)
Reuses sparsity pattern on subsequent calls (faster)

dot(rhs)[source]

Matrix-free matrix-vector or matrix-matrix product K(rho) @ rhs.

Parameters:

rhs (cp.ndarray or cp.sparse.csr_matrix) – Right-hand side vector or matrix

Returns:

result – Product K(rho) @ rhs

Return type:

cp.ndarray or cp.sparse.csr_matrix

Raises:

ValueError – If rho not set or shape mismatch
NotImplementedError – If rhs type not supported

Notes

Requires rho set via set_rho() first
Matrix-free: No explicit matrix storage
Faster than constructing full matrix for large problems

class pyFANTOM.stiffness.CUDA._FEA.GeneralStiffnessKernel(mesh: CuGeneralMesh)[source]

Bases: StiffnessKernel

CUDA-accelerated stiffness kernel for heterogeneous meshes.

GPU version of CPU GeneralStiffnessKernel. Most general kernel supporting meshes with elements of varying sizes and topologies. Uses flat storage with pointer arrays for memory-efficient representation. Identical API to CPU version but operates entirely on GPU.

Parameters:: mesh (CuGeneralMesh) – CUDA general mesh with potentially heterogeneous elements (mesh.is_uniform can be False)

K_flat

Flattened element stiffness matrices (on GPU)

Type:: cupy.ndarray

K_ptr

Pointer array for accessing K_flat: K_e = K_flat[K_ptr[e]:K_ptr[e+1]] (on GPU)

Type:: cupy.ndarray

elements_flat

Flattened element connectivity (on GPU)

Type:: cupy.ndarray

elements_ptr

Pointer array for accessing elements (on GPU)

Type:: cupy.ndarray

element_sizes

Number of nodes per element, shape (n_elements,) (on GPU)

Type:: cupy.ndarray

shape

Global stiffness matrix dimensions (n_dof, n_dof)

Type:: tuple

constraints

Boolean array for constrained DOFs (on GPU)

Type:: cupy.ndarray (bool)

has_cons

True if boundary conditions have been applied

Type:: bool

Same interface as StructuredStiffnessKernel (set_rho, dot, construct, diagonal, etc.)

Notes

Supports mixed element types (triangles + quads, tets + hexes, etc.)
Flat storage with pointers enables variable element sizes
Slower than Uniform/Structured kernels due to indirection
Automatically used if mesh.is_uniform == False
All arrays stored as CuPy arrays on GPU
All operations on GPU using CuPy
Use with CUDA solvers (CG, MultiGrid) for best performance
Memory limited by GPU VRAM

Examples

>>> from pyFANTOM.CUDA import GeneralMesh, GeneralStiffnessKernel
>>> import numpy as np
>>> # Mixed triangular and quad mesh
>>> nodes = np.array([[0,0], [1,0], [0,1], [1,1], [2,0]])
>>> elements = [np.array([0,1,2]), np.array([1,4,3,2])]  # Triangle + Quad
>>> mesh = GeneralMesh(nodes, elements)
>>> kernel = GeneralStiffnessKernel(mesh=mesh)

__init__(mesh: CuGeneralMesh)[source]

Initialize base stiffness kernel.

Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.

diagonal(rho=None)[source]

Compute diagonal entries of stiffness matrix.

Parameters:: rho (cupy.ndarray, optional) – Element density variables, shape (n_elements,). If None, uses cached self.rho
Returns:: Diagonal entries of K(rho), shape (n_dof,)
Return type:: cupy.ndarray

Notes

Uses GPU-accelerated flat storage assembly
Respects Dirichlet boundary conditions (constrained DOFs zeroed)
Caches result in self.diag for reuse
Used for preconditioning and diagnostics

set_constraints(constraints)[source]

Set Dirichlet boundary condition DOF (replaces existing constraints).

Parameters:: constraints (cp.ndarray or array-like) – Indices of constrained DOF

Notes

Resets all previous constraints
Use add_constraints() to append to existing constraints
Constrained rows/columns are zeroed in matrix operations

add_constraints(constraints)[source]

Add Dirichlet boundary conditions (accumulates with existing constraints).

Parameters:: constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Adds to existing constraints. Use set_constraints() to replace all constraints. - Use set_constraints() to replace all constraints

process_grad(U)[source]

Compute element-wise compliance sensitivities dC/drho.

Parameters:: U (cp.ndarray) – Displacement vector from FEA solve
Returns:: dk – Element sensitivities (compliance derivative per element)
Return type:: cp.ndarray

Notes

Computes: dk[e] = -U[e]^T @ K_e @ U[e]
Used for compliance minimization gradient
Negative sign convention for minimization

construct(rho)[source]

Construct CSR stiffness matrix representation.

Parameters:: rho (cupy.ndarray) – Element density variables, shape (n_elements,)
Returns:: Sparse CSR stiffness matrix K(rho), shape (n_dof, n_dof)
Return type:: cupy.sparse.csr_matrix

Notes

Assembles full sparse matrix structure (first call) or updates values (subsequent)
Uses matrix-matrix product with identity to extract sparsity pattern
Caches sparsity pattern (ptr) for efficient updates
Sets has_been_constructed flag after first call
More memory-intensive than matrix-free operations but needed for some solvers

dot(rhs)[source]

Matrix-vector or matrix-matrix product K @ rhs.

Parameters:

rhs (cupy.ndarray or cupy.sparse.csr_matrix) – Right-hand side vector or matrix - Vector: shape (n_dof,) - Matrix: shape (n_dof, n_cols) or sparse CSR matrix

Returns:

Result K @ rhs, same type and shape as rhs (except columns)

Return type:

cupy.ndarray or cupy.sparse.csr_matrix

Raises:

ValueError – If rho not set or rhs shape mismatch
NotImplementedError – If rhs is not cupy array or sparse matrix

Notes

Uses cached self.rho if available
Automatically detects vector vs matrix input
Converts sparse matrices to CSR format if needed
Main interface for matrix operations

class pyFANTOM.stiffness.CUDA._FEA.UniformStiffnessKernel(mesh: CuGeneralMesh)[source]

Bases: StiffnessKernel

CUDA-accelerated stiffness kernel for uniform unstructured meshes.

GPU version of CPU UniformStiffnessKernel. Handles unstructured meshes where all elements have the same number of nodes and identical geometry, using element-specific stiffness matrices. More general than StructuredStiffnessKernel but less efficient. Identical API to CPU version but operates entirely on GPU memory.

Parameters:: mesh (CuGeneralMesh) – CUDA general mesh with uniform element topology (mesh.is_uniform must be True)

Ks

Element stiffness matrices, shape (n_elements, dof_per_element, dof_per_element) (on GPU)

Type:: cupy.ndarray

shape

Global stiffness matrix dimensions (n_dof, n_dof)

Type:: tuple

constraints

Boolean array for constrained DOFs (on GPU)

Type:: cupy.ndarray (bool)

has_cons

True if boundary conditions have been applied

Type:: bool

Same interface as StructuredStiffnessKernel (set_rho, dot, construct, diagonal, etc.)

Notes

Requires mesh.is_uniform == True (all elements same size)
Stores per-element stiffness matrices (more memory than StructuredStiffnessKernel)
Use for unstructured uniform meshes (e.g., triangular or tetrahedral)
For structured grids, prefer StructuredStiffnessKernel for better performance
All arrays stored as CuPy arrays on GPU
All operations on GPU using CuPy
Use with CUDA solvers (CG, MultiGrid) for best performance
Memory limited by GPU VRAM

Examples

>>> from pyFANTOM.CUDA import GeneralMesh, UniformStiffnessKernel
>>> import numpy as np
>>> # Triangular mesh
>>> nodes = np.array([[0,0], [1,0], [0,1], [1,1]])
>>> elements = np.array([[0,1,2], [1,3,2]])
>>> mesh = GeneralMesh(nodes, elements)
>>> kernel = UniformStiffnessKernel(mesh=mesh)

__init__(mesh: CuGeneralMesh)[source]

Initialize base stiffness kernel.

Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.

diagonal(rho=None)[source]

Extract diagonal of stiffness matrix.

Parameters:: rho (cp.ndarray, optional) – Design variables on GPU. If None, uses self.rho (must be set via set_rho())
Returns:: Diagonal entries on GPU, shape (n_dof,)
Return type:: cp.ndarray
Raises:: ValueError – If rho is None and has_rho is False

Notes

Useful for Jacobi preconditioning or diagonal scaling in iterative solvers.

set_constraints(constraints)[source]

Set Dirichlet boundary conditions (replaces existing constraints).

Parameters:: constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Clears all existing constraints and sets only the specified DOFs. Use add_constraints() to add constraints incrementally.

add_constraints(constraints)[source]

Add Dirichlet boundary conditions (accumulates with existing constraints).

Parameters:: constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)

Notes

Adds to existing constraints. Use set_constraints() to replace all constraints.

process_grad(U)[source]

Compute element-wise compliance sensitivities dC/drho.

Parameters:: U (cp.ndarray) – Displacement vector from FEA solve on GPU, shape (n_dof,)
Returns:: dk – Element sensitivities on GPU, shape (n_elements,)
Return type:: cp.ndarray

Notes

Computes -U^T @ dK/drho @ U for each element, used in compliance sensitivity analysis. All operations performed on GPU.

construct(rho)[source]

Build explicit CSR sparse matrix representation on GPU.

Parameters:: rho (cp.ndarray) – Design variables on GPU, shape (n_elements,)
Returns:: Sparse stiffness matrix in CSR format on GPU
Return type:: cupyx.scipy.sparse.csr_matrix

Notes

Memory-intensive: O(nnz) storage on GPU
Required for direct solvers (SPSOLVE)
Reuses sparsity pattern on subsequent calls (faster)

dot(rhs)[source]

Matrix-vector or matrix-matrix product K @ rhs.

Parameters:

rhs (cupy.ndarray or cupy.sparse.csr_matrix) – Right-hand side vector or matrix - Vector: shape (n_dof,) - Matrix: shape (n_dof, n_cols) or sparse CSR matrix

Returns:

Result K @ rhs, same type and shape as rhs (except columns)

Return type:

cupy.ndarray or cupy.sparse.csr_matrix

Raises:

ValueError – If rho not set or rhs shape mismatch
NotImplementedError – If rhs is not cupy array or sparse matrix

Notes

Uses cached self.rho if available
Automatically detects vector vs matrix input
Converts sparse matrices to CSR format if needed
Main interface for matrix operations

class pyFANTOM.solvers.CUDA._solvers.CG(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]

Bases: Solver

CUDA-accelerated Conjugate Gradient iterative solver.

GPU version of the CG solver using CuPy. Efficiently solves large sparse symmetric positive definite linear systems Kx=b on GPU with optional matrix-free operation.

Parameters:

kernel (CuStiffnessKernel) – CUDA stiffness kernel (matrix operator)
maxiter (int, optional) – Maximum number of CG iterations (default: 1000)
tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)
matrix_free (bool, optional) – Whether to use matrix-free mode. If None, automatically determined: - True for StructuredStiffnessKernel (recommended) - False for General/UniformStiffnessKernel (faster)

Notes

Matrix-free mode: Lower memory, slower per-iteration
Assembled matrix mode: Higher memory, faster per-iteration
For structured meshes, matrix-free is preferred
For general meshes, assembled matrix is faster

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, CG
>>> mesh = StructuredMesh2D(nx=256, ny=256, lx=1.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = CG(kernel=kernel, maxiter=1000, tol=1e-6)

__init__(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]

solve(rhs, rho=None, use_last=True)[source]

Solve linear system K(rho) @ U = rhs on GPU.

Parameters:

rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)
rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)
use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True

Returns:

U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)
residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||

Raises:

ValueError – If rho is not provided and kernel.has_rho is False

Notes

All operations performed on GPU using CuPy
Warm-starting (use_last=True) accelerates convergence
Matrix-free mode uses kernel.dot() directly
Assembled mode constructs explicit CSR matrix first

class pyFANTOM.solvers.CUDA._solvers.GMRES(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]

Bases: Solver

CUDA-accelerated Generalized Minimal Residual iterative solver.

GPU version of GMRES solver using CuPy. Solves non-symmetric linear systems Kx=b on GPU. More general than CG but typically slower and more memory-intensive.

Parameters:

kernel (CuStiffnessKernel) – CUDA stiffness kernel (matrix operator)
maxiter (int, optional) – Maximum number of GMRES iterations (default: 1000)
tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)
matrix_free (bool, optional) – Whether to use matrix-free mode. If None, automatically determined: - True for StructuredStiffnessKernel - False for General/UniformStiffnessKernel (recommended)

Notes

GMRES handles non-symmetric systems (rare in FEA)
CG is preferred for symmetric positive definite systems
Higher memory usage than CG due to Krylov subspace storage

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, GMRES
>>> solver = GMRES(kernel=kernel, maxiter=1000, tol=1e-6)

__init__(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]

solve(rhs, rho=None, use_last=True)[source]

Solve linear system K(rho) @ U = rhs on GPU using GMRES.

Parameters:

rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)
rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)
use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True

Returns:

U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)
residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||

Raises:

ValueError – If rho is not provided and kernel.has_rho is False

Notes

All operations performed on GPU using CuPy
Warm-starting (use_last=True) accelerates convergence
Matrix-free mode uses kernel.dot() directly
Assembled mode constructs explicit CSR matrix first
GMRES handles non-symmetric systems (rare in FEA)
CG is preferred for symmetric positive definite systems

class pyFANTOM.solvers.CUDA._solvers.SPSOLVE(kernel: StiffnessKernel)[source]

Bases: Solver

CUDA sparse direct solver using CuPy’s spsolve.

GPU-accelerated direct sparse LU factorization solver. Fast for small to medium problems but memory-intensive. Limited to < 3M DOF due to GPU memory constraints.

Parameters:: kernel (CuStiffnessKernel) – CUDA stiffness kernel (must construct explicit matrix)
Raises:: ValueError – If kernel.shape[0] > 3e6 (3 million DOF limit)

Notes

Direct solver: No iterations, exact solution (within numerical precision)
High memory usage: Stores factorization on GPU
Use iterative solvers (CG, MultiGrid) for large problems
Automatically handles constrained DOF

Examples

>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, SPSOLVE
>>> mesh = StructuredMesh2D(nx=64, ny=64, lx=1.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = SPSOLVE(kernel=kernel)  # OK for small problems

__init__(kernel: StiffnessKernel)[source]

solve(rhs, rho=None, **kwargs)[source]

Solve the linear system Kx = rhs using sparse direct solver.

Parameters:

rhs (cp.ndarray) – Right-hand side vector
rho (cp.ndarray, optional) – Design variable densities. If None, uses kernel’s stored rho
**kwargs – Additional keyword arguments (unused, for API compatibility)

Returns:

U (cp.ndarray) – Solution vector
residual (float) – Relative residual norm ||rhs - K@U|| / ||rhs||

Notes

Direct solver: No warm start needed
Constructs and factorizes K every call (expensive)
Consider iterative solvers for repeated solves in optimization

class pyFANTOM.solvers.CUDA._solvers.MultiGrid(mesh: CuStructuredMesh2D | CuStructuredMesh3D, kernel: StiffnessKernel, maxiter=1000, tol=1e-05, n_smooth=3, omega=0.5, n_level=3, cycle='W', w_level=1, coarse_solver='cholmod', matrix_free=False, low_level_tol=1e-08, low_level_maxiter=5000, min_omega=0.4, omega_boost=1.06)[source]

Bases: Solver

CUDA-accelerated Geometric Multigrid solver for structured meshes.

GPU implementation of geometric multigrid with Jacobi smoothing and coarse grid correction. Most memory-efficient solver for large 3D problems on GPU. Uses restriction/prolongation operators on nested grids for O(n) complexity per iteration.

Parameters:

mesh (StructuredMesh2D or StructuredMesh3D) – CUDA structured mesh (required for geometric coarsening)
kernel (CuStiffnessKernel) – CUDA stiffness kernel (must be StructuredStiffnessKernel)
maxiter (int, optional) – Maximum outer iterations (preconditioned CG) (default: 1000)
tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)
n_smooth (int, optional) – Number of Jacobi smoothing iterations per level (default: 3)
omega (float, optional) – Jacobi relaxation parameter (default: 0.5)
n_level (int, optional) – Number of multigrid levels (default: 3)
cycle (str, optional) – Multigrid cycle type: ‘V’ or ‘W’ (default: ‘W’)
w_level (int or list, optional) – Levels to apply W-cycle at (default: 1)
coarse_solver (str, optional) – Coarse grid solver: ‘cholmod’, ‘cg’, ‘gmres’, ‘splu’, ‘spsolve’ (default: ‘cholmod’)
matrix_free (bool, optional) – Whether to use matrix-free operators (default: False, not yet implemented)
low_level_tol (float, optional) – Tolerance for coarse grid iterative solver (default: 1e-8)
low_level_maxiter (int, optional) – Max iterations for coarse grid iterative solver (default: 5000)
min_omega (float, optional) – Minimum omega for adaptive relaxation (default: 0.4)
omega_boost (float, optional) – Omega boost factor per level (default: 1.06)

Notes

Only works with structured meshes (uniform grid coarsening)
Memory efficient: No explicit matrix storage
Preconditioned CG with multigrid preconditioner
Adaptive omega based on residual
CHOLMOD coarse solver requires CPU transfer (fast for small coarse grids)

Raises:: ValueError – If matrix_free=True with splu/spsolve coarse solver If coarse_solver not in [‘cg’, ‘gmres’, ‘splu’, ‘spsolve’, ‘cholmod’]

Examples

>>> from pyFANTOM.CUDA import StructuredMesh3D, StructuredStiffnessKernel, MultiGrid
>>> mesh = StructuredMesh3D(nx=128, ny=128, nz=128, lx=1.0, ly=1.0, lz=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = MultiGrid(mesh=mesh, kernel=kernel, n_level=4, cycle='W')

__init__(mesh: CuStructuredMesh2D | CuStructuredMesh3D, kernel: StiffnessKernel, maxiter=1000, tol=1e-05, n_smooth=3, omega=0.5, n_level=3, cycle='W', w_level=1, coarse_solver='cholmod', matrix_free=False, low_level_tol=1e-08, low_level_maxiter=5000, min_omega=0.4, omega_boost=1.06)[source]

reset()[source]: Reset cached multigrid hierarchy (forces rebuild on next solve).

solve(rhs, rho=None, use_last=True)[source]

Solve linear system K(rho) @ U = rhs on GPU using multigrid-preconditioned CG.

Parameters:

rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)
rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)
use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True

Returns:

U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)
residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||

Raises:

ValueError – If rho is not provided and kernel.has_rho is False

Notes

All operations performed on GPU using CuPy
First call builds multigrid hierarchy (expensive, ~1-5 seconds)
Subsequent calls reuse hierarchy (fast, ~0.1-1 second)
Uses preconditioned CG with multigrid V/W-cycle preconditioner
Adaptive omega adjusts relaxation based on residual reduction
Frees GPU memory after solve to prevent accumulation
Most memory-efficient solver for large 3D problems (>10M DOF)

class pyFANTOM.Optimizers.CUDA.MMA.MMA(problem: Problem, sub_tol=1e-07, sub_maxiter=100, change_tol=0.0001, fun_tol=1e-06, move=0.5, timer=False)[source]

Bases: Optimizer

CUDA-accelerated Method of Moving Asymptotes (MMA) optimizer.

GPU version of MMA using CuPy arrays. Identical API to CPU version but operates on GPU memory for maximum performance in large-scale topology optimization.

Parameters:

problem (Problem) – Optimization problem (e.g., MinimumCompliance) with CUDA backend
sub_tol (float, optional) – Sub-problem convergence tolerance (default: 1e-7)
sub_maxiter (int, optional) – Sub-problem max iterations (default: 100)
change_tol (float, optional) – Design variable change convergence tolerance (default: 1e-4)
fun_tol (float, optional) – Objective function change tolerance (default: 1e-6)
move (float, optional) – Move limit for design variables, range [0,1] (default: 0.5)
timer (bool, optional) – Return iteration time if True (default: False)

Notes

All arrays stored as CuPy arrays on GPU
5-10x faster than CPU for large problems (>1M DOF)
Requires CUDA-capable GPU and CuPy
Best for: General constraints, multi-material, multi-constraint problems

Examples

>>> from pyFANTOM.CUDA import MMA, MinimumCompliance
>>> optimizer = MMA(problem=problem, sub_tol=1e-7, fun_tol=1e-6)
>>> for i in range(100):
>>>     optimizer.iter()
>>>     if optimizer.converged():
>>>         break

__init__(problem: Problem, sub_tol=1e-07, sub_maxiter=100, change_tol=0.0001, fun_tol=1e-06, move=0.5, timer=False)[source]

Create an optimizer bound to a problem.

Parameters - problem: instance of Problem providing objective and gradients.

iter()[source]

Perform one MMA optimization iteration on GPU.

Updates design variables by solving a convex sub-problem approximation of the original problem. Uses moving asymptotes to control step size. All operations performed on GPU using CuPy.

Returns:: If timer=True, returns iteration time in seconds
Return type:: float, optional

Notes

Solves convex sub-problem using mmasub() on GPU
Updates moving asymptotes (low, upp) for next iteration
Tracks design variable change and objective change
Handles NaN values by clipping to bounds

converged(*args, **kwargs)[source]

Check if optimizer has converged.

Returns:: True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol
Return type:: bool

Notes

Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.

logs()[source]

Return diagnostic information for current iteration.

Returns:: Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)
Return type:: dict

Notes

Used for monitoring optimization progress and convergence.

class pyFANTOM.Optimizers.CUDA.OC.OC(problem: Problem, move: float = 0.2, change_tol=0.0001, fun_tol=1e-06, timer=False)[source]

Bases: Optimizer

CUDA-accelerated Optimality Criteria (OC) optimizer.

GPU version of OC using CuPy arrays. Fast heuristic optimizer specifically designed for topology optimization with independent volume constraints. Identical API to CPU version.

Parameters:

problem (Problem) – Optimization problem (must have independent constraints) with CUDA backend
move (float, optional) – Move limit for design variables (default: 0.2)
change_tol (float, optional) – Convergence tolerance for design change (default: 1e-4)
fun_tol (float, optional) – Convergence tolerance for objective change (default: 1e-6)
timer (bool, optional) – Return iteration time (default: False)

Notes

All arrays stored as CuPy arrays on GPU
2-3x faster than CPU per iteration
Best for: Classic minimum compliance with volume constraint
Limited scope: Only works with independent constraints
Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import OC
>>> optimizer = OC(problem=problem, move=0.2)
>>> for i in range(200):
>>>     optimizer.iter()
>>>     if optimizer.converged():
>>>         break

__init__(problem: Problem, move: float = 0.2, change_tol=0.0001, fun_tol=1e-06, timer=False)[source]

Create an optimizer bound to a problem.

Parameters - problem: instance of Problem providing objective and gradients.

iter()[source]

Perform one OC optimization iteration on GPU.

Updates design variables using optimality criteria update formula with bisection to find Lagrange multiplier satisfying volume constraint. All operations performed on GPU using CuPy.

Returns:: If timer=True, returns iteration time in seconds
Return type:: float, optional

Notes

Computes OC parameter: ocP = x * sqrt(-df/dg)
Uses bisection to find Lagrange multiplier lambda
Updates: x_new = clip(xL, xU, ocP/lambda)
Enforces move limits: xL = x - move, xU = x + move

converged(*args, **kwargs)[source]

Check if optimizer has converged.

Returns:: True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol
Return type:: bool

Notes

Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.

logs()[source]

Return diagnostic information for current iteration.

Returns:: Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)
Return type:: dict

Notes

Used for monitoring optimization progress and convergence.

class pyFANTOM.Optimizers.CUDA.PGD.PGD(problem: Problem, change_tol=0.0001, fun_tol=1e-06, maxiter_N=50, tol_B=1e-08, tol_N=1e-06, C=1000000000000.0, fall_back_move=0.2, alpha_max=100.0, relaxation=1.0, warmup_iter=50, timer=False, conjugate_directions=True)[source]

Bases: Optimizer

CUDA-accelerated Projected Gradient Descent optimizer.

GPU version of PGD using CuPy arrays. Fast first-order optimizer using BFGS-based step size adaptation and projection onto feasible set. Identical API to CPU version.

Parameters:

problem (Problem) – Optimization problem with CUDA backend
change_tol (float, optional) – Design variable change tolerance (default: 1e-4)
fun_tol (float, optional) – Objective function change tolerance (default: 1e-6)
maxiter_N (int, optional) – Max iterations for Newton’s method in projection (default: 50)
tol_B (float, optional) – Bisection tolerance (default: 1e-8)
tol_N (float, optional) – Newton solver tolerance (default: 1e-6)
C (float, optional) – Large constant for penalty (default: 1e12)
fall_back_move (float, optional) – Fallback step size when BFGS fails (default: 0.2)
alpha_max (float, optional) – Maximum step size (default: 1e2)
relaxation (float, optional) – Step size relaxation parameter (default: 1.0)
warmup_iter (int, optional) – Iterations before enforcing strict feasibility (default: 50)
timer (bool, optional) – Return iteration time (default: False)
conjugate_directions (bool, optional) – Use conjugate gradient directions (default: True)

Notes

All arrays stored as CuPy arrays on GPU
Faster than CPU: ~30-50% faster per iteration
Best for: Simple bounds, single constraint problems
Less robust: May struggle with multiple nonlinear constraints
Requires CUDA-capable GPU and CuPy

Examples

>>> from pyFANTOM.CUDA import PGD
>>> optimizer = PGD(problem=problem, conjugate_directions=True)
>>> for i in range(100):
>>>     optimizer.iter()
>>>     if optimizer.converged():
>>>         break

__init__(problem: Problem, change_tol=0.0001, fun_tol=1e-06, maxiter_N=50, tol_B=1e-08, tol_N=1e-06, C=1000000000000.0, fall_back_move=0.2, alpha_max=100.0, relaxation=1.0, warmup_iter=50, timer=False, conjugate_directions=True)[source]

Create an optimizer bound to a problem.

Parameters - problem: instance of Problem providing objective and gradients.

alpha(desvars, df)[source]

project_to_feasible(desvars_new, desvars, dg)[source]

iter()[source]

Perform one PGD optimization iteration on GPU.

Updates design variables using projected gradient descent with adaptive step size (BFGS-based) and projection onto feasible set. All operations performed on GPU using CuPy.

Returns:: If timer=True, returns iteration time in seconds
Return type:: float, optional

Notes

Computes step size using BFGS approximation (alpha)
Uses conjugate gradient directions if enabled
Projects update onto feasible set (satisfies constraints)
For independent constraints: uses bisection
For coupled constraints: uses Newton’s method with Wolfe line search

converged(*args, **kwargs)[source]

Check if optimizer has converged.

Returns:: True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol
Return type:: bool

Notes

Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.

logs()[source]

Return diagnostic information for current iteration.

Returns:: Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)
Return type:: dict

Notes

Used for monitoring optimization progress and convergence.

wolfe_line_search(l, delta, dg, desvars_new, c1=0.0001, c2=0.9, max_iter=10)[source]: Wolfe conditions line search for Newton step

class pyFANTOM.Problem.CUDA.MinimumCompliance.MinimumCompliance(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Bases: Problem

CUDA-accelerated minimum compliance topology optimization problem.

GPU version of MinimumCompliance using CuPy arrays. Identical API to CPU version but operates on GPU memory for maximum performance in large-scale optimization.

Parameters:

FE (FiniteElement) – CUDA finite element analysis engine with mesh, kernel, and solver
filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter for ensuring minimum feature sizes
E_mul (list of float, optional) – Young’s modulus multipliers for each material (default: [1.0] for single material)
void (float, optional) – Minimum density to avoid singularity (default: 1e-6)
penalty (float, optional) – SIMP penalization exponent (default: 3.0). Higher = more binary designs
volume_fraction (list of float, optional) – Volume fraction constraint for each material (default: [0.25])
penalty_schedule (callable, optional) – Function(p, iteration) for penalty continuation. If None, uses constant penalty
heavyside (bool, optional) – Apply Heaviside projection for sharper 0-1 designs (default: True)
beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.
eta (float, optional) – Heaviside projection threshold (default: 0.5)

Notes

All arrays stored as CuPy arrays on GPU
5-10x faster than CPU for large problems (>1M DOF)
Requires CUDA-capable GPU and CuPy
Identical SIMP formulation and Heaviside projection as CPU version

Examples

>>> from pyFANTOM.CUDA import *
>>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = CG(kernel=kernel)
>>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver)
>>> filter = StructuredFilter2D(mesh=mesh, r_min=1.5)
>>> problem = MinimumCompliance(FE=FE, filter=filter, penalty=3.0, volume_fraction=[0.3])

__init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Initialize problem state.

Subclasses may accept finite-element handlers, filters, and other configuration parameters.

N()[source]

Return the number of design variables.

Returns:: Total number of design variables (n_elements * n_materials)
Return type:: int

m()[source]

Return the number of constraints.

Returns:: Number of volume constraints (1 for single material, n_materials for multi-material)
Return type:: int

is_independent()[source]

Check if constraints are independent (required for OC optimizer).

Returns:: True if constraints are independent (always True for MinimumCompliance)
Return type:: bool

constraint_map()[source]

Return mapping of constraints to design variables.

Returns:

Single material: 1 (scalar)
Multi-material: array of shape (n_materials, n_vars) with 1s indicating which design variables belong to each material constraint

Return type:

int or cp.ndarray

Notes

Used by optimizers to identify which design variables affect each constraint. For multi-material, constraint i affects variables [i*n_elements:(i+1)*n_elements].

bounds()[source]

Return bounds for design variables.

Returns:: (lower_bound, upper_bound) where both are 0.0 and 1.0 respectively
Return type:: tuple

Notes

Design variables represent normalized densities in [0, 1].

visualize_problem(**kwargs)[source]

Visualize problem setup (mesh, BCs, loads).

Parameters:: **kwargs – Arguments passed to FE.visualize_problem()
Returns:: Plot object
Return type:: matplotlib.axes.Axes or k3d.Plot

Notes

GPU arrays are automatically transferred to CPU for visualization.

visualize_solution(**kwargs)[source]

Visualize optimized design (density distribution).

Parameters:: **kwargs – Arguments passed to FE.visualize_density()
Returns:: Plot object
Return type:: matplotlib.axes.Axes or k3d.Plot

Notes

Shows current design variables as density field. For multi-material problems, displays combined material distribution. GPU arrays transferred to CPU for visualization.

init_desvars()[source]

Initialize design variables to uniform density at volume fraction.

Sets all design variables to the volume fraction constraint value and performs initial FEA solve to compute objective and constraints.

Notes

Single material: all variables set to volume_fraction
Multi-material: all variables set to min(volume_fraction) for each material
Resets iteration counter to 0
Triggers _compute() to evaluate objective and constraints
All operations performed on GPU

set_desvars(desvars: cupy.ndarray)[source]

Set design variables and recompute objective/constraints.

Parameters:: desvars (cp.ndarray) – Design variables on GPU, shape (n_vars,). Values should be in [0, 1]
Raises:: ValueError – If desvars shape doesn’t match num_vars

Notes

Triggers FEA solve and sensitivity computation on GPU
Increments iteration counter
Updates cached values: _f, _g, _nabla_f, _nabla_g

get_desvars()[source]

Get current design variables.

Returns:: Current design variables on GPU, shape (n_vars,)
Return type:: cp.ndarray

Notes

Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).

penalize(rho: cupy.ndarray)[source]

penalize_grad(rho: cupy.ndarray)[source]

f(rho: cupy.ndarray | None = None)[source]

Compute objective function value (compliance).

Parameters:: rho (cp.ndarray, optional) – Design variables on GPU for linearization. If None, returns cached value
Returns:: Compliance value (F^T @ U). Lower is better (stiffer structure).
Return type:: float

Notes

If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)
If rho is None: returns cached value from last set_desvars() call
Compliance = F^T @ U = U^T @ K @ U (strain energy)
All operations performed on GPU

nabla_f(rho: cupy.ndarray | None = None)[source]

Compute objective function gradient (compliance sensitivities).

Parameters:: rho (cp.ndarray, optional) – Unused (for interface compatibility)
Returns:: Gradient of compliance w.r.t. design variables on GPU, shape (n_vars,)
Return type:: cp.ndarray

Notes

Uses adjoint method: dC/drho = -U^T @ dK/drho @ U
Includes filter adjoint: sens_raw = H^T @ sens_filtered
Negative gradient means increasing density reduces compliance (good)
All operations performed on GPU

g(rho=None)[source]

Compute constraint values (volume fraction violations).

Parameters:: rho (cp.ndarray, optional) – Design variables on GPU. If None, uses current desvars
Returns:: Constraint violations, shape (n_constraints,). Negative = satisfied. g[i] = (volume_fraction[i] - actual_volume_fraction[i])
Return type:: cp.ndarray

Notes

Constraint satisfied when g <= 0
For single material: returns scalar
For multi-material: returns array with one constraint per material
All operations performed on GPU

nabla_g()[source]

Compute constraint gradients (volume fraction sensitivities).

Returns:: Gradient of constraints w.r.t. design variables on GPU. - Single material: shape (n_vars,) - Multi-material: shape (n_materials, n_vars)
Return type:: cp.ndarray

Notes

Each row is d(volume_fraction[i])/drho
For uniform elements, gradient is constant: 1/volume_total per element
Used by optimizers to enforce volume constraints

ill_conditioned()[source]

Check if FEA system is ill-conditioned.

Returns:: True if residual >= 1e-2 (indicates poor solver convergence)
Return type:: bool

Notes

Residual > 1e-2 suggests numerical issues (check void parameter, penalty)
May indicate near-singular stiffness matrix (too many void elements)
Consider increasing void parameter or reducing penalty

is_terminal()[source]

Check if penalty continuation has reached final value.

Returns:: True if penalty schedule is complete or not used
Return type:: bool

Notes

Used by optimizers to determine if continuation is finished
If penalty_schedule is None, always returns True
If penalty_schedule exists, checks if current iteration has reached final penalty

logs()[source]

Return diagnostic information for current iteration.

Returns:: Dictionary with keys: - ‘iteration’: Current iteration number - ‘residual’: FEA solver residual (||K@U - F|| / ||F||)
Return type:: dict

Notes

Used by optimizers to track convergence and diagnose issues.

FEA(thresshold: bool = True)[source]

class pyFANTOM.Problem.CUDA.ComplianceConstrainedMinimumVolume.ComplianceConstrainedMinimumVolume(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E: float = 1.0, void: float = 1e-06, penalty: float = 3.0, compliance_limit: float = 0.25, penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Bases: Problem

Minimum volume topology optimization with compliance constraint (CUDA).

GPU-accelerated version of ComplianceConstrainedMinimumVolume. Inverse formulation of MinimumCompliance: minimize material usage while maintaining structure stiffness above threshold. All computations performed on GPU using CuPy.

Parameters:

FE (FiniteElement) – CUDA finite element analysis engine
filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter
E (float, optional) – Young’s modulus (default: 1.0)
void (float, optional) – Minimum density to avoid singularity (default: 1e-6)
penalty (float, optional) – SIMP penalization exponent (default: 3.0)
compliance_limit (float, optional) – Maximum allowable compliance (normalized by initial compliance) (default: 0.25)
penalty_schedule (callable, optional) – Penalty continuation function(p, iteration)
heavyside (bool, optional) – Apply Heaviside projection (default: True)
beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.
eta (float, optional) – Heaviside threshold (default: 0.5)

Notes

Objective: Minimize volume
Constraint: Compliance ≤ compliance_limit * initial_compliance
Use case: Lightweight design, material cost minimization
Comparison: More challenging than MinimumCompliance (compliance constraint is nonlinear)
Typical compliance_limit: 0.2-0.5 of initial design
GPU Acceleration: All arrays stored as CuPy arrays on GPU
Performance: 5-10x faster than CPU for large problems

Examples

>>> from pyFANTOM.CUDA import *
>>> problem = ComplianceConstrainedMinimumVolume(
>>>     FE=FE, filter=filter,
>>>     compliance_limit=0.3,  # 30% of initial compliance
>>>     penalty=3.0
>>> )

__init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E: float = 1.0, void: float = 1e-06, penalty: float = 3.0, compliance_limit: float = 0.25, penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Initialize problem state.

Subclasses may accept finite-element handlers, filters, and other configuration parameters.

N()[source]

Return the number of design variables.

Returns:: Total number of design variables (n_elements)
Return type:: int

m()[source]

Return the number of constraints.

Returns:: Number of constraints (1: compliance constraint)
Return type:: int

is_independent()[source]

Check if constraints are independent (required for OC optimizer).

Returns:: True if constraints are independent (always True for ComplianceConstrainedMinimumVolume)
Return type:: bool

constraint_map()[source]

Return mapping of constraints to design variables.

Returns:: Scalar value 1 indicating all design variables affect the compliance constraint
Return type:: int

Notes

Used by optimizers to identify which design variables affect the constraint. For this problem, all variables affect compliance.

bounds()[source]

Return bounds for design variables.

Returns:: (lower_bound, upper_bound) where both are 0.0 and 1.0 respectively
Return type:: tuple

Notes

Design variables represent normalized densities in [0, 1].

visualize_problem(**kwargs)[source]

Visualize problem setup (mesh, BCs, loads).

Parameters:: **kwargs – Arguments passed to FE.visualize_problem()
Returns:: Plot object
Return type:: matplotlib.axes.Axes or k3d.Plot

visualize_solution(**kwargs)[source]

Visualize optimized design (density distribution).

Parameters:: **kwargs – Arguments passed to FE.visualize_density()
Returns:: Plot object
Return type:: matplotlib.axes.Axes or k3d.Plot

Notes

Shows current design variables as density field.

init_desvars()[source]

Initialize design variables to full density (volume = 1.0).

Sets all design variables to 1.0 (full material) and performs initial FEA solve to compute objective and constraints. This ensures initial design satisfies compliance constraint.

Notes

All variables set to 1.0 (full material)
Resets iteration counter to 0
Triggers _compute() to evaluate objective and constraints

set_desvars(desvars: cupy.ndarray)[source]

Set design variables and recompute objective/constraints.

Parameters:: desvars (cupy.ndarray) – Design variables, shape (n_vars,). Values should be in [0, 1]
Raises:: ValueError – If desvars shape doesn’t match num_vars

Notes

Triggers FEA solve and sensitivity computation
Increments iteration counter
Updates cached values: _f, _g, _nabla_f, _nabla_g

get_desvars()[source]

Get current design variables.

Returns:: Current design variables, shape (n_vars,)
Return type:: cupy.ndarray

Notes

Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).

penalize(rho: cupy.ndarray)[source]

Apply SIMP penalization and Heaviside projection.

Parameters:: rho (cupy.ndarray) – Filtered densities, shape (n_elements,)
Returns:: Penalized Young’s moduli, shape (n_elements,)
Return type:: cupy.ndarray

Notes

Applies Heaviside projection if enabled
Applies SIMP: E = E0 * rho^penalty
Clips to void value to avoid singularity
Uses penalty_schedule if provided
Uses beta_schedule if beta is callable

penalize_grad(rho: cupy.ndarray)[source]

Compute gradient of penalization function.

Parameters:: rho (cupy.ndarray) – Filtered densities, shape (n_elements,)
Returns:: Gradient dE/drho, shape (n_elements,)
Return type:: cupy.ndarray

Notes

Derivative of SIMP + Heaviside projection
Used in chain rule for sensitivity analysis
Accounts for penalty_schedule if provided
Accounts for beta_schedule if beta is callable

f(rho: cupy.ndarray | None = None)[source]

Compute objective function value (volume).

Parameters:: rho (cupy.ndarray, optional) – Design variables for linearization. If None, returns cached value
Returns:: Volume fraction (normalized by total domain volume). Lower is better.
Return type:: float

Notes

If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)
If rho is None: returns cached value from last set_desvars() call
Objective is to minimize material usage while satisfying compliance constraint

nabla_f(rho: cupy.ndarray | None = None)[source]

Compute objective function gradient (volume sensitivities).

Parameters:: rho (cupy.ndarray, optional) – Unused (for interface compatibility)
Returns:: Gradient of volume w.r.t. design variables, shape (n_vars,)
Return type:: cupy.ndarray

Notes

Constant gradient: 1/volume_total per element
All elements contribute equally to volume
Used by optimizers to minimize material usage

g(rho=None)[source]

Compute constraint values (compliance constraint violations).

Parameters:: rho (cupy.ndarray, optional) – Design variables. If None, uses current desvars
Returns:: Normalized compliance constraint violation. Negative = satisfied. g = (compliance - compliance_limit) / compliance_limit
Return type:: float

Notes

Constraint satisfied when g <= 0
Normalized by compliance_limit for better scaling
If rho provided, uses linearized approximation

nabla_g()[source]

Compute constraint gradients (compliance sensitivities).

Returns:: Gradient of compliance constraint w.r.t. design variables, shape (n_vars,)
Return type:: cupy.ndarray

Notes

Normalized by compliance_limit for better scaling
Uses adjoint method: dC/drho = -U^T @ dK/drho @ U
Includes filter adjoint: sens_raw = H^T @ sens_filtered
Used by optimizers to enforce compliance constraint

ill_conditioned()[source]

Check if FEA system is ill-conditioned.

Returns:: True if residual >= 1e-2 (indicates poor solver convergence)
Return type:: bool

Notes

Residual > 1e-2 suggests numerical issues (check void parameter, penalty)
May indicate near-singular stiffness matrix (too many void elements)
Consider increasing void parameter or reducing penalty

is_terminal()[source]

Check if penalty continuation has reached final value.

Returns:: True if penalty schedule is complete or not used
Return type:: bool

Notes

Used by optimizers to determine if continuation is finished
If penalty_schedule is None, always returns True
If penalty_schedule exists, checks if current iteration has reached final penalty

logs()[source]

Return diagnostic information for current iteration.

Returns:: Dictionary with keys: - ‘iteration’: Current iteration number - ‘residual’: FEA solver residual (||K@U - F|| / ||F||)
Return type:: dict

Notes

Used by optimizers to track convergence and diagnose issues.

class pyFANTOM.Problem.CUDA.WeightDistributionMinimumCompliance.WeightDistributionMinimumCompliance(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, target_centroid: List[float], maximum_distance: float, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Bases: Problem

Minimum compliance with weight distribution constraint (CUDA).

GPU-accelerated version of WeightDistributionMinimumCompliance. Extends MinimumCompliance with an additional constraint on the centroid location of the material distribution. Useful for balancing structures or controlling center of mass for dynamic applications. All computations performed on GPU using CuPy.

Parameters:

FE (FiniteElement) – CUDA finite element analysis engine with mesh, kernel, and solver
filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter for ensuring minimum feature sizes
target_centroid (List[float]) – Target centroid location in physical coordinates, shape (2,) for 2D or (3,) for 3D
maximum_distance (float) – Maximum allowed distance from target centroid (constraint: ||centroid - target|| <= maximum_distance)
E_mul (list of float, optional) – Young’s modulus multipliers for each material (default: [1.0] for single material)
void (float, optional) – Minimum density to avoid singularity (default: 1e-6)
penalty (float, optional) – SIMP penalization exponent (default: 3.0). Higher = more binary designs
volume_fraction (list of float, optional) – Volume fraction constraint for each material (default: [0.25])
penalty_schedule (callable, optional) – Function(p, iteration) for penalty continuation. If None, uses constant penalty
heavyside (bool, optional) – Apply Heaviside projection for sharper 0-1 designs (default: True)
beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.
eta (float, optional) – Heaviside projection threshold (default: 0.5)

is_single_material

True for single material, False for multi-material optimization

Type:: bool

n_material

Number of materials

Type:: int

iteration

Current iteration count

Type:: int

target_centroid

Target centroid location

Type:: cupy.ndarray

maximum_distance

Maximum allowed distance from target

Type:: float

f()[source]: Compute compliance objective

nabla_f()[source]: Compute compliance sensitivities

g()[source]: Compute constraint values (volume + centroid distance)

nabla_g()[source]: Compute constraint gradients

get_desvars()[source]: Get current design variables

set_desvars(rho)[source]: Set design variables and trigger FEA solve

visualize_solution(**kwargs)[source]: Plot optimized design

Notes

Constraints: - Volume constraint: Same as MinimumCompliance - Centroid constraint: ||centroid(rho) - target_centroid|| <= maximum_distance - Constraints are coupled (not independent), so OC optimizer cannot be used

Centroid Computation: - Centroid = sum(rho[i] * volume[i] * centroid[i]) / sum(rho[i] * volume[i]) - Weighted by element density and volume

Use Cases: - Balancing structures for dynamic loads - Controlling center of mass - Symmetric designs with off-center loading - Multi-material designs with weight distribution requirements

GPU Acceleration: - All arrays stored as CuPy arrays on GPU - 5-10x faster than CPU for large problems

Examples

>>> from pyFANTOM.CUDA import *
>>> # Setup FEA
>>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0)
>>> kernel = StructuredStiffnessKernel(mesh=mesh)
>>> solver = CG(kernel=kernel)
>>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver)
>>>
>>> # Apply BCs and loads
>>> FE.add_dirichlet_boundary_condition(node_ids=fixed_nodes, rhs=0)
>>> FE.add_point_forces(node_ids=load_nodes, forces=cp.array([[0, -1.0]]))
>>>
>>> # Define optimization problem with centroid constraint
>>> filter = StructuredFilter2D(mesh=mesh, r_min=1.5)
>>> problem = WeightDistributionMinimumCompliance(
>>>     FE=FE, filter=filter,
>>>     target_centroid=[1.0, 0.5],  # Center of 2D domain
>>>     maximum_distance=0.1,  # 10% of domain size
>>>     penalty=3.0, volume_fraction=[0.3]
>>> )

__init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, target_centroid: List[float], maximum_distance: float, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]

Initialize problem state.

Subclasses may accept finite-element handlers, filters, and other configuration parameters.

N()[source]

Return the number of design variables.

Returns:: Total number of design variables (n_elements * n_materials)
Return type:: int

m()[source]

Return the number of constraints.

Returns:: Number of constraints: - Single material: 2 (volume + centroid distance) - Multi-material: n_materials + 1 (volume per material + centroid distance)
Return type:: int

is_independent()[source]

Check if constraints are independent (required for OC optimizer).

Returns:: Always False. Centroid constraint couples all design variables, making constraints non-independent. Use MMA or PGD optimizer instead of OC.
Return type:: bool

constraint_map()[source]

Return mapping of constraints to design variables.

Returns:

Single material: 1 (scalar, all variables affect both constraints)
Multi-material: array of shape (n_materials + 1, n_vars) where: - Rows 0 to n_materials-1: volume constraints per material - Last row: centroid constraint (all variables = 1)

Return type:

int or ndarray

Notes

Used by optimizers to identify which design variables affect each constraint. The centroid constraint depends on all design variables, making constraints coupled.

bounds()[source]

Return bounds for design variables.

Returns:: (lower_bound, upper_bound) where both are 0.0 and 1.0 respectively
Return type:: tuple

Notes

Design variables represent normalized densities in [0, 1].

visualize_problem(**kwargs)[source]

Visualize problem setup (mesh, BCs, loads).

Parameters:: **kwargs – Arguments passed to FE.visualize_problem()
Returns:: Plot object
Return type:: matplotlib.axes.Axes or k3d.Plot

visualize_solution(**kwargs)[source]

Visualize optimized design (density distribution).

Parameters:: **kwargs – Arguments passed to FE.visualize_density()
Returns:: Plot object
Return type:: matplotlib.axes.Axes or k3d.Plot

Notes

Shows current design variables as density field. For multi-material problems, displays combined material distribution.

init_desvars()[source]

Initialize design variables with random smoothed distribution.

Uses random initialization followed by triple filtering to create a smooth initial guess. This helps avoid getting stuck in poor local minima that may violate the centroid constraint.

Notes

Random initialization: uniform distribution in [min(0, 2*vf-1), 1.0]
Triple filtering: applies filter 3 times for smoothness
Normalized to match volume fraction constraint
Resets iteration counter to 0
Triggers _compute() to evaluate objective and constraints

set_desvars(desvars: cupy.ndarray)[source]

Set design variables and recompute objective/constraints.

Parameters:: desvars (cupy.ndarray) – Design variables, shape (n_vars,). Values should be in [0, 1]
Raises:: ValueError – If desvars shape doesn’t match num_vars

Notes

Triggers FEA solve and sensitivity computation
Increments iteration counter
Updates cached values: _f, _g, _nabla_f, _nabla_g
Computes centroid and centroid constraint gradient

get_desvars()[source]

Get current design variables.

Returns:: Current design variables, shape (n_vars,)
Return type:: cupy.ndarray

Notes

Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).

penalize(rho: cupy.ndarray)[source]

Apply SIMP penalization and Heaviside projection.

Parameters:: rho (cupy.ndarray) – Filtered densities, shape (n_elements,) or (n_elements, n_materials)
Returns:: Penalized Young’s moduli, shape (n_elements,)
Return type:: cupy.ndarray

Notes

Applies Heaviside projection if enabled
Applies SIMP: E = E0 * rho^penalty
Clips to void value to avoid singularity
Uses penalty_schedule if provided
Uses beta_schedule if beta is callable
For multi-material: uses material interpolation scheme

penalize_grad(rho: cupy.ndarray)[source]

Compute gradient of penalization function.

Parameters:: rho (cupy.ndarray) – Filtered densities, shape (n_elements,) or (n_elements, n_materials)
Returns:: Gradient dE/drho, shape (n_elements,) or (n_elements, n_materials)
Return type:: cupy.ndarray

Notes

Derivative of SIMP + Heaviside projection
Used in chain rule for sensitivity analysis
Accounts for penalty_schedule if provided
Accounts for beta_schedule if beta is callable
For multi-material: returns gradient per material

f(rho: cupy.ndarray | None = None)[source]

Compute objective function value (compliance).

Parameters:: rho (cupy.ndarray, optional) – Design variables for linearization. If None, returns cached value
Returns:: Compliance value (F^T @ U). Lower is better (stiffer structure).
Return type:: float

Notes

If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)
If rho is None: returns cached value from last set_desvars() call
Compliance = F^T @ U = U^T @ K @ U (strain energy)

nabla_f(rho: cupy.ndarray | None = None)[source]

Compute objective function gradient (compliance sensitivities).

Parameters:: rho (cupy.ndarray, optional) – Unused (for interface compatibility)
Returns:: Gradient of compliance w.r.t. design variables, shape (n_vars,)
Return type:: cupy.ndarray

Notes

Uses adjoint method: dC/drho = -U^T @ dK/drho @ U
Includes filter adjoint: sens_raw = H^T @ sens_filtered
Negative gradient means increasing density reduces compliance (good)

g(rho=None)[source]

Compute constraint values (volume + centroid distance violations).

Parameters:: rho (cupy.ndarray, optional) – Design variables. If None, uses current desvars
Returns:: Constraint violations, shape (n_constraints,). Negative = satisfied. - First n_materials constraints: volume fraction violations - Last constraint: centroid distance violation (distance^2 - maximum_distance^2)
Return type:: cupy.ndarray

Notes

Constraint satisfied when g <= 0
For single material: returns [volume_violation, centroid_violation]
For multi-material: returns [vol_viol_1, …, vol_viol_n, centroid_violation]
If rho provided, uses linearized approximation

nabla_g()[source]

Compute constraint gradients (volume + centroid sensitivities).

Returns:: Gradient of constraints w.r.t. design variables. - Single material: shape (2, n_vars) - [volume_grad, centroid_grad] - Multi-material: shape (n_materials + 1, n_vars) - [vol_grad_1, …, vol_grad_n, centroid_grad]
Return type:: cupy.ndarray

Notes

First rows: d(volume_fraction[i])/drho (constant for uniform elements)
Last row: d(centroid_distance^2)/drho (depends on element positions)
Centroid gradient accounts for weighted average of element centroids
Used by optimizers to enforce constraints

ill_conditioned()[source]

Check if FEA system is ill-conditioned.

Returns:: True if residual >= 1e-2 (indicates poor solver convergence)
Return type:: bool

Notes

Residual > 1e-2 suggests numerical issues (check void parameter, penalty)
May indicate near-singular stiffness matrix (too many void elements)
Consider increasing void parameter or reducing penalty

is_terminal()[source]

Check if penalty continuation has reached final value.

Returns:: True if penalty schedule is complete or not used
Return type:: bool

Notes

Used by optimizers to determine if continuation is finished
If penalty_schedule is None, always returns True
If penalty_schedule exists, checks if current iteration has reached final penalty

logs()[source]

Return diagnostic information for current iteration.

Returns:: Dictionary with keys: - ‘iteration’: Current iteration number - ‘residual’: FEA solver residual (||K@U - F|| / ||F||)
Return type:: dict

Notes

Used by optimizers to track convergence and diagnose issues.