CUDA Backend
The CUDA backend provides GPU-accelerated implementations of all core functionality for topology optimization.
CUDA backend public API.
Importing from this module gives access to GPU/CUDA implementations of core pyFANTOM components. The API mirrors the CPU backend so users can switch backends by changing the import path.
>>> from pyFANTOM.CUDA import StructuredMesh2D, FiniteElement, MinimumCompliance
Mesh Classes
pyFANTOM.geom.CUDA._mesh.CuStructuredMesh2D- CUDA 2D structured meshpyFANTOM.geom.CUDA._mesh.CuStructuredMesh3D- CUDA 3D structured meshpyFANTOM.geom.CUDA._mesh.CuGeneralMesh- CUDA general unstructured mesh
Filter Classes
pyFANTOM.geom.CUDA._filters.CuStructuredFilter2D- CUDA 2D structured density filterpyFANTOM.geom.CUDA._filters.CuStructuredFilter3D- CUDA 3D structured density filterpyFANTOM.geom.CUDA._filters.CuGeneralFilter- CUDA general density filter
Finite Element Classes
pyFANTOM.FiniteElement.CUDA.FiniteElement.FiniteElement- CUDA finite element analysis engine
Stiffness Kernel Classes
pyFANTOM.stiffness.CUDA._FEA.StructuredStiffnessKernel- CUDA structured mesh stiffness kernelpyFANTOM.stiffness.CUDA._FEA.GeneralStiffnessKernel- CUDA general mesh stiffness kernelpyFANTOM.stiffness.CUDA._FEA.UniformStiffnessKernel- CUDA uniform element stiffness kernel
Solver Classes
pyFANTOM.solvers.CUDA._solvers.CG- CUDA Conjugate Gradient iterative solverpyFANTOM.solvers.CUDA._solvers.GMRES- CUDA GMRES iterative solverpyFANTOM.solvers.CUDA._solvers.SPSOLVE- CUDA sparse direct solverpyFANTOM.solvers.CUDA._solvers.MultiGrid- CUDA multigrid iterative solver
Optimizer Classes
pyFANTOM.Optimizers.CUDA.MMA.MMA- CUDA Method of Moving Asymptotes optimizerpyFANTOM.Optimizers.CUDA.OC.OC- CUDA Optimality Criteria optimizerpyFANTOM.Optimizers.CUDA.PGD.PGD- CUDA Projected Gradient Descent optimizer
Problem Classes
pyFANTOM.Problem.CUDA.MinimumCompliance.MinimumCompliance- CUDA minimum compliance problempyFANTOM.Problem.CUDA.ComplianceConstrainedMinimumVolume.ComplianceConstrainedMinimumVolume- CUDA volume-constrained compliance problempyFANTOM.Problem.CUDA.WeightDistributionMinimumCompliance.WeightDistributionMinimumCompliance- CUDA weight distribution problem
Detailed Documentation
- class pyFANTOM.geom.CUDA._mesh.CuStructuredMesh2D(nx, ny, lx, ly, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]
Bases:
StructuredMeshCUDA-accelerated 2D structured mesh with uniform rectangular elements.
GPU version of StructuredMesh2D using CuPy arrays for all data storage. Identical API to CPU version but operates on GPU memory for efficient assembly and computation.
- Parameters:
nx (int) – Number of elements in x-direction
ny (int) – Number of elements in y-direction
lx (float) – Physical length of domain in x-direction
ly (float) – Physical length of domain in y-direction
dtype (np.dtype, optional) – Data type for arrays (default: np.float64)
physics (Physx, optional) – Physics model defining material behavior (default: LinearElasticity(E=1.0, nu=1/3))
Notes
All arrays are stored as CuPy arrays on GPU
Element stiffness matrix is computed on CPU then transferred to GPU
Use with CuStructuredStiffnessKernel for GPU-accelerated assembly
Requires CUDA-capable GPU and CuPy installation
Examples
>>> from pyFANTOM.CUDA import StructuredMesh2D >>> from pyFANTOM import LinearElasticity >>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0) >>> print(f"GPU memory usage: {mesh.nodes.nbytes + mesh.elements.nbytes} bytes")
- class pyFANTOM.geom.CUDA._mesh.CuStructuredMesh3D(nx, ny, nz, lx, ly, lz, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]
Bases:
StructuredMeshCUDA-accelerated 3D structured mesh with uniform hexahedral elements.
GPU version of StructuredMesh3D using CuPy arrays for all data storage. Identical API to CPU version but operates on GPU memory for efficient 3D topology optimization.
- Parameters:
nx (int) – Number of elements in x-direction
ny (int) – Number of elements in y-direction
nz (int) – Number of elements in z-direction
lx (float) – Physical length of domain in x-direction
ly (float) – Physical length of domain in y-direction
lz (float) – Physical length of domain in z-direction
dtype (np.dtype, optional) – Data type for arrays (default: np.float64)
physics (Physx, optional) – Physics model defining material behavior (default: LinearElasticity(E=1.0, nu=1/3))
Notes
All arrays are stored as CuPy arrays on GPU
Use MultiGrid solver for large 3D problems to manage GPU memory
Element stiffness matrix computed on CPU then transferred
Requires sufficient GPU memory for 3D problems (typically >8GB for 100^3 elements)
Examples
>>> from pyFANTOM.CUDA import StructuredMesh3D >>> mesh = StructuredMesh3D(nx=64, ny=64, nz=32, lx=1.0, ly=1.0, lz=0.5) >>> print(f"Total DOF: {len(mesh.nodes) * mesh.dof}")
- class pyFANTOM.geom.CUDA._mesh.CuGeneralMesh(nodes, elements, dtype=<class 'numpy.float64'>, physics: ~pyFANTOM.physics._physx.Physx = <pyFANTOM.physics.LinearElasticity.LinearElasticity object>)[source]
Bases:
GeneralMeshCUDA-accelerated general unstructured mesh.
GPU version of GeneralMesh. Inherits CPU mesh initialization logic then transfers all arrays to GPU memory. Supports both uniform and heterogeneous element topologies.
- Parameters:
nodes (ndarray) – Node coordinates (CPU array), shape (n_nodes, spatial_dim)
elements (list or ndarray) – Element connectivity (CPU array or list)
dtype (np.dtype, optional) – Data type for arrays (default: np.float64)
physics (Physx, optional) – Physics model (default: LinearElasticity(E=1.0, nu=1/3))
Notes
Mesh processing (node cleanup, stiffness computation) happens on CPU
Resulting arrays are transferred to GPU after initialization
Use CuGeneralStiffnessKernel for GPU-accelerated assembly
For large unstructured meshes, GPU offers significant speedup over CPU
Examples
>>> from pyFANTOM.CUDA import GeneralMesh >>> import numpy as np >>> nodes = np.array([[0,0], [1,0], [0,1], [1,1]]) >>> elements = np.array([[0,1,2], [1,3,2]]) >>> mesh = GeneralMesh(nodes, elements)
- class pyFANTOM.geom.CUDA._filters.CuStructuredFilter2D(mesh: CuStructuredMesh2D, r_min)[source]
Bases:
FilterKernelCUDA-accelerated 2D density filter for structured meshes.
GPU version of StructuredFilter2D using CuPy. Fast convolution-based filtering on GPU. Identical API to CPU version but operates on GPU memory for maximum performance.
- Parameters:
mesh (CuStructuredMesh2D) – CUDA 2D structured mesh
r_min (float) – Filter radius in element units
- weights
Precomputed filter weights on GPU
- Type:
cupy.ndarray
- offsets
Element index offsets for neighbors on GPU
- Type:
cupy.ndarray
- normalizer
Per-element normalization for adjoint on GPU
- Type:
cupy.ndarray
- dot(rho)
Forward filter application
Notes
Uses cone filter: w(d) = (r_min - d) / r_min
Handles non-square elements (dx != dy) automatically
Typical r_min values: 1.5-3.0 element units
Larger r_min produces smoother, coarser designs
All arrays stored on GPU as CuPy arrays
GPU-accelerated filtering for maximum performance
Requires CUDA-capable GPU and CuPy
Examples
>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredFilter2D >>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0) >>> filter = StructuredFilter2D(mesh=mesh, r_min=2.0) >>> rho_smooth = filter.dot(rho)
- __init__(mesh: CuStructuredMesh2D, r_min)[source]
- class pyFANTOM.geom.CUDA._filters.CuStructuredFilter3D(mesh: CuStructuredMesh3D, r_min)[source]
Bases:
FilterKernelCUDA-accelerated 3D density filter for structured meshes.
GPU version of StructuredFilter3D using CuPy. All filtering operations on GPU for maximum performance in 3D topology optimization. Identical API to CPU version but operates on GPU memory.
- Parameters:
mesh (CuStructuredMesh3D) – CUDA 3D structured mesh
r_min (float) – Filter radius in element units (e.g., r_min=1.5 includes elements within 1.5 element widths)
- weights
Precomputed filter weights for neighbor elements on GPU
- Type:
cupy.ndarray
- offsets
Element index offsets for neighbors within filter radius on GPU
- Type:
cupy.ndarray
- normalizer
Per-element normalization factors for adjoint operation on GPU
- Type:
cupy.ndarray
- dot(rho)
Apply forward filter: filtered_rho = H @ rho
Notes
Filter uses cone-shaped kernel: w(d) = (r_min - d) / r_min for d < r_min
Automatically handles anisotropic elements (dx != dy != dz)
Adjoint includes proper normalization for optimization gradients
Memory-efficient: stores only unique weights and offsets, not full matrix
All arrays stored on GPU as CuPy arrays
5-10x faster than CPU for large 3D problems
Essential for large-scale 3D optimization
Requires CUDA-capable GPU and CuPy
Examples
>>> from pyFANTOM.CUDA import StructuredMesh3D, StructuredFilter3D >>> mesh = StructuredMesh3D(nx=64, ny=64, nz=32, lx=1.0, ly=1.0, lz=0.5) >>> filter = StructuredFilter3D(mesh=mesh, r_min=1.5) >>> rho_filtered = filter.dot(rho_raw)
- __init__(mesh: CuStructuredMesh3D, r_min)[source]
- class pyFANTOM.geom.CUDA._filters.CuGeneralFilter(mesh: CuGeneralMesh, r_min)[source]
Bases:
FilterKernelCUDA-accelerated density filter for unstructured meshes.
GPU version of GeneralFilter using CuPy sparse matrices. Implements density filtering for general unstructured meshes using spatial search. Stores explicit sparse filter matrix.
- Parameters:
mesh (CuGeneralMesh) – CUDA-accelerated general unstructured mesh
r_min (float) – Filter radius in physical units (not element units)
- kernel
Explicit filter matrix stored on GPU, shape (n_elements, n_elements)
- Type:
cupyx.scipy.sparse.csr_matrix
- dot(rho)
Forward filter: H @ rho
Notes
Uses spatial search (KDTree on CPU) to build filter, then transfers to GPU
Stores explicit sparse matrix (memory-intensive for large meshes)
r_min is in physical units, not element-relative
Automatically detects 2D vs 3D from mesh
GPU acceleration provides 3-5x speedup over CPU for large meshes
Examples
>>> from pyFANTOM.CUDA import GeneralMesh, GeneralFilter >>> mesh = GeneralMesh(nodes, elements) # Triangular mesh >>> filter = GeneralFilter(mesh=mesh, r_min=0.05) # Physical radius >>> rho_filtered = filter.dot(rho)
- __init__(mesh: CuGeneralMesh, r_min)[source]
- class pyFANTOM.FiniteElement.CUDA.FiniteElement.FiniteElement(mesh: CuStructuredMesh2D | CuStructuredMesh3D | CuGeneralMesh, kernel: StructuredStiffnessKernel | GeneralStiffnessKernel | UniformStiffnessKernel, solver: CG | GMRES | SPSOLVE | MultiGrid)[source]
Bases:
FiniteElementCUDA-accelerated finite element analysis engine.
GPU version of FiniteElement using CuPy for all computations. Provides identical API to CPU version but with 5-10x performance improvement for large problems. All arrays stored on GPU memory.
- Parameters:
mesh (StructuredMesh2D, StructuredMesh3D, or GeneralMesh) – CUDA mesh defining geometry and physics
kernel (StructuredStiffnessKernel, GeneralStiffnessKernel, or UniformStiffnessKernel) – CUDA stiffness assembly kernel
solver (CG, GMRES, SPSOLVE, or MultiGrid) – CUDA linear system solver
- mesh
Associated CUDA mesh
- Type:
Mesh
- kernel
Stiffness assembly kernel
- Type:
StiffnessKernel
- solver
Linear solver
- Type:
Solver
- rhs
Right-hand side force vector on GPU, shape (n_nodes * dof,)
- Type:
cupy.ndarray
- d_rhs
Prescribed displacement values for Dirichlet BCs, shape (n_nodes * dof,)
- Type:
cupy.ndarray
Notes
All arrays stored as CuPy arrays on GPU
KDTree searches performed on CPU (temporary transfer)
Visualization methods transfer data to CPU automatically
Requires CUDA-capable GPU and CuPy
Examples
>>> from pyFANTOM.CUDA import * >>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0) >>> kernel = StructuredStiffnessKernel(mesh=mesh) >>> solver = CG(kernel=kernel) >>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver) >>> >>> # Apply BCs and loads >>> FE.add_dirichlet_boundary_condition(node_ids=fixed_nodes, rhs=0) >>> FE.add_point_forces(node_ids=load_nodes, forces=cp.array([[0, -1.0]])) >>> >>> # Solve >>> U, residual = FE.solve(rho=cp.ones(nel) * 0.5)
- __init__(mesh: CuStructuredMesh2D | CuStructuredMesh3D | CuGeneralMesh, kernel: StructuredStiffnessKernel | GeneralStiffnessKernel | UniformStiffnessKernel, solver: CG | GMRES | SPSOLVE | MultiGrid)[source]
Initialize CUDA finite element analysis engine.
- Parameters:
mesh (StructuredMesh2D, StructuredMesh3D, or GeneralMesh) – CUDA mesh defining geometry and physics
kernel (StructuredStiffnessKernel, GeneralStiffnessKernel, or UniformStiffnessKernel) – CUDA stiffness assembly kernel
solver (CG, GMRES, SPSOLVE, or MultiGrid) – CUDA linear system solver
Notes
Initializes force vector (rhs) and Dirichlet BC storage (d_rhs) on GPU. All arrays remain on GPU throughout computation.
- add_dirichlet_boundary_condition(node_ids: cupy.ndarray | None = None, positions: cupy.ndarray | None = None, dofs: cupy.ndarray | None = None, rhs: float | cupy.ndarray = 0.0)[source]
Apply Dirichlet (fixed displacement) boundary conditions.
- Parameters:
node_ids (cp.ndarray, optional) – Node indices to constrain on GPU, shape (n_nodes,)
positions (cp.ndarray, optional) – Physical coordinates to constrain (uses KDTree search on CPU), shape (n_nodes, spatial_dim)
dofs (cp.ndarray, optional) – DOF mask per node, shape (n_nodes, dof) with 1=constrained, 0=free. If None, all DOFs at specified nodes are constrained
rhs (float or cp.ndarray, optional) – Prescribed displacement values. Scalar for zero displacement, or array shape (n_nodes, dof)
Notes
Provide either node_ids OR positions, not both
dofs array: [[1,0]] constrains only x-displacement in 2D
Multiple calls accumulate constraints
Modifies kernel.constraints boolean array on GPU
KDTree search performed on CPU (nodes transferred temporarily)
- add_neumann_boundary_condition(**kwargs)[source]
Apply Neumann boundary conditions (not implemented).
- Raises:
NotImplementedError – Always raised. Neumann boundary conditions not yet implemented for CUDA FEA.
- add_point_forces(forces: cupy.ndarray, node_ids: cupy.ndarray | None = None, positions: cupy.ndarray | None = None)[source]
Apply point loads to specified nodes.
- Parameters:
forces (cp.ndarray) – Force vectors on GPU, shape (n_forces, dof) where dof is 2 for 2D or 3 for 3D. For 2D: [[fx, fy]], for 3D: [[fx, fy, fz]]
node_ids (cp.ndarray, optional) – Node indices for force application on GPU, shape (n_forces,)
positions (cp.ndarray, optional) – Physical coordinates for force application (uses KDTree search on CPU)
Notes
Provide either node_ids OR positions
Multiple calls accumulate forces
Forces are automatically set to prescribed values at Dirichlet nodes
Units should match physics model (e.g., Newtons for SI units)
KDTree search performed on CPU (nodes transferred temporarily)
- reset_forces()[source]
Clear all applied forces.
Sets the right-hand side force vector to zero on GPU. Useful when reconfiguring loading conditions or starting a new analysis.
- reset_dirichlet_boundary_conditions()[source]
Remove all Dirichlet boundary conditions.
Clears all fixed displacement constraints and resets the kernel’s constraint state. Useful when reconfiguring boundary conditions.
- visualize_problem(ax=None, **kwargs)[source]
Visualize mesh with boundary conditions and loads.
- Parameters:
ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure
**kwargs – Additional arguments passed to plot_problem_2D/3D
- Returns:
Plot object (2D: matplotlib axes, 3D: k3d plot)
- Return type:
matplotlib.axes.Axes or k3d.Plot
Notes
GPU arrays are transferred to CPU for visualization (.get() calls)
Shows mesh elements, fixed nodes, and applied forces
- visualize_density(rho, ax=None, **kwargs)[source]
Visualize density distribution (optimization result).
- Parameters:
rho (cp.ndarray or ndarray) – Element densities on GPU or CPU, shape (n_elements,) or (n_elements, n_materials)
ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure
**kwargs – Additional arguments passed to plot_problem_2D/3D
- Returns:
Plot object (2D: matplotlib axes, 3D: k3d plot)
- Return type:
matplotlib.axes.Axes or k3d.Plot
Notes
GPU arrays are automatically transferred to CPU for visualization
Color-codes elements by density value (0=void, 1=solid)
- visualize_field(field, ax=None, rho=None, **kwargs)[source]
Visualize scalar or vector field (displacement, stress, strain, etc.).
- Parameters:
field (cp.ndarray or ndarray) – Field values to plot on GPU or CPU. Shape depends on field type: - Scalar: (n_elements,) or (n_nodes,) - Vector: (n_nodes, dof) for displacement-like fields
ax (matplotlib.axes.Axes, optional) – Existing axes to plot on (2D only). If None, creates new figure
rho (cp.ndarray or ndarray, optional) – Density mask to hide void regions, shape (n_elements,)
**kwargs – Additional arguments passed to plot_field_2D/3D
- Returns:
Plot object (2D: matplotlib axes, 3D: k3d plot)
- Return type:
matplotlib.axes.Axes or k3d.Plot
Notes
GPU arrays are automatically transferred to CPU for visualization
Common fields: displacement, Von Mises stress, strain energy
If rho provided, elements with rho < 0.5 are hidden
- solve(rho=None)[source]
Solve finite element system: K(rho) @ U = F on GPU.
- Parameters:
rho (cp.ndarray, optional) – Element density variables on GPU, shape (n_elements,). If None, uses rho=1 (full density)
- Returns:
U (cp.ndarray) – Displacement vector on GPU, shape (n_nodes * dof,). Interleaved DOFs: [ux0, uy0, ux1, uy1, …]
residual (float) – Normalized residual ||K@U - F|| / ||F|| for solution validation
Notes
residual < 1e-5 indicates accurate solution
residual > 1e-2 indicates ill-conditioned system
All operations performed on GPU for maximum performance
Solver reuses factorization if available
For topology optimization, rho represents material distribution
- class pyFANTOM.stiffness.CUDA._FEA.StructuredStiffnessKernel(mesh: CuStructuredMesh2D | CuStructuredMesh3D)[source]
Bases:
StiffnessKernelCUDA-accelerated stiffness kernel for structured meshes.
GPU version of CPU StructuredStiffnessKernel using CuPy. Uses node-basis assembly with a single precomputed element stiffness matrix for maximum efficiency on uniform grid meshes. Identical API to CPU version but operates entirely on GPU memory for maximum performance.
- Parameters:
mesh (CuStructuredMesh2D or CuStructuredMesh3D) – CUDA-accelerated structured mesh with uniform elements
- K_single
Single element stiffness matrix used for all elements (on GPU)
- Type:
cupy.ndarray
- non_con_map
Index map to non-constrained DOFs for reduced system (on GPU)
- Type:
cupy.ndarray
- set_rho(rho)
Set design variables for subsequent matrix-free operations
Notes
Fastest kernel for structured meshes (2-3x faster than UniformStiffnessKernel on GPU)
All arrays stored as CuPy arrays on GPU
5-10x faster than CPU for large 3D problems
Automatically handles constrained DOFs by zeroing rows/columns
Matrix-free dot() operation uses optimized CUDA kernels
Use construct() only when explicit matrix is needed (e.g., for direct solvers)
Requires CUDA-capable GPU and CuPy
Use with CUDA solvers (CG, MultiGrid)
Memory limited by GPU VRAM
Examples
>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel >>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0) >>> kernel = StructuredStiffnessKernel(mesh=mesh) >>> kernel.set_rho(rho) >>> u = kernel.dot(f) # Matrix-free K @ f on GPU
- __init__(mesh: CuStructuredMesh2D | CuStructuredMesh3D)[source]
Initialize base stiffness kernel.
Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.
- diagonal(rho=None)[source]
Extract diagonal of assembled stiffness matrix.
- Parameters:
rho (cp.ndarray, optional) – Design densities. If None, uses stored rho
- Returns:
diag – Diagonal entries of K(rho)
- Return type:
cp.ndarray
Notes
Used for Jacobi preconditioning in iterative solvers
Cached on subsequent calls with same rho
- set_constraints(constraints)[source]
Set Dirichlet boundary conditions (replaces existing constraints).
- Parameters:
constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)
Notes
Clears all existing constraints and sets only the specified DOFs. Use add_constraints() to add constraints incrementally.
- add_constraints(constraints)[source]
Add Dirichlet boundary conditions (accumulates with existing constraints).
- Parameters:
constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)
Notes
Adds to existing constraints. Use set_constraints() to replace all constraints.
- process_grad(U)[source]
Compute element-wise compliance sensitivities dC/drho.
- Parameters:
U (cp.ndarray) – Displacement vector from FEA solve on GPU, shape (n_dof,)
- Returns:
dk – Element sensitivities on GPU (compliance derivative per element), shape (n_elements,)
- Return type:
cp.ndarray
Notes
Computes: dk[e] = -U[e]^T @ K_e @ U[e]
Used for compliance minimization gradient
Negative sign convention for minimization
All operations performed on GPU
- construct(rho)[source]
Build explicit CSR sparse matrix representation on GPU.
- Parameters:
rho (cp.ndarray) – Design variables on GPU, shape (n_elements,)
- Returns:
Sparse stiffness matrix in CSR format on GPU
- Return type:
cupyx.scipy.sparse.csr_matrix
Notes
Memory-intensive: O(nnz) storage on GPU
Required for direct solvers (SPSOLVE)
Reuses sparsity pattern on subsequent calls (faster)
- dot(rhs)[source]
Matrix-free matrix-vector or matrix-matrix product K(rho) @ rhs.
- Parameters:
rhs (cp.ndarray or cp.sparse.csr_matrix) – Right-hand side vector or matrix
- Returns:
result – Product K(rho) @ rhs
- Return type:
cp.ndarray or cp.sparse.csr_matrix
- Raises:
ValueError – If rho not set or shape mismatch
NotImplementedError – If rhs type not supported
Notes
Requires rho set via set_rho() first
Matrix-free: No explicit matrix storage
Faster than constructing full matrix for large problems
- class pyFANTOM.stiffness.CUDA._FEA.GeneralStiffnessKernel(mesh: CuGeneralMesh)[source]
Bases:
StiffnessKernelCUDA-accelerated stiffness kernel for heterogeneous meshes.
GPU version of CPU GeneralStiffnessKernel. Most general kernel supporting meshes with elements of varying sizes and topologies. Uses flat storage with pointer arrays for memory-efficient representation. Identical API to CPU version but operates entirely on GPU.
- Parameters:
mesh (CuGeneralMesh) – CUDA general mesh with potentially heterogeneous elements (mesh.is_uniform can be False)
- K_flat
Flattened element stiffness matrices (on GPU)
- Type:
cupy.ndarray
- K_ptr
Pointer array for accessing K_flat: K_e = K_flat[K_ptr[e]:K_ptr[e+1]] (on GPU)
- Type:
cupy.ndarray
- elements_flat
Flattened element connectivity (on GPU)
- Type:
cupy.ndarray
- elements_ptr
Pointer array for accessing elements (on GPU)
- Type:
cupy.ndarray
- element_sizes
Number of nodes per element, shape (n_elements,) (on GPU)
- Type:
cupy.ndarray
- Same interface as StructuredStiffnessKernel (set_rho, dot, construct, diagonal, etc.)
Notes
Supports mixed element types (triangles + quads, tets + hexes, etc.)
Flat storage with pointers enables variable element sizes
Slower than Uniform/Structured kernels due to indirection
Automatically used if mesh.is_uniform == False
All arrays stored as CuPy arrays on GPU
All operations on GPU using CuPy
Use with CUDA solvers (CG, MultiGrid) for best performance
Memory limited by GPU VRAM
Examples
>>> from pyFANTOM.CUDA import GeneralMesh, GeneralStiffnessKernel >>> import numpy as np >>> # Mixed triangular and quad mesh >>> nodes = np.array([[0,0], [1,0], [0,1], [1,1], [2,0]]) >>> elements = [np.array([0,1,2]), np.array([1,4,3,2])] # Triangle + Quad >>> mesh = GeneralMesh(nodes, elements) >>> kernel = GeneralStiffnessKernel(mesh=mesh)
- __init__(mesh: CuGeneralMesh)[source]
Initialize base stiffness kernel.
Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.
- diagonal(rho=None)[source]
Compute diagonal entries of stiffness matrix.
- Parameters:
rho (cupy.ndarray, optional) – Element density variables, shape (n_elements,). If None, uses cached self.rho
- Returns:
Diagonal entries of K(rho), shape (n_dof,)
- Return type:
cupy.ndarray
Notes
Uses GPU-accelerated flat storage assembly
Respects Dirichlet boundary conditions (constrained DOFs zeroed)
Caches result in self.diag for reuse
Used for preconditioning and diagnostics
- set_constraints(constraints)[source]
Set Dirichlet boundary condition DOF (replaces existing constraints).
- Parameters:
constraints (cp.ndarray or array-like) – Indices of constrained DOF
Notes
Resets all previous constraints
Use add_constraints() to append to existing constraints
Constrained rows/columns are zeroed in matrix operations
- add_constraints(constraints)[source]
Add Dirichlet boundary conditions (accumulates with existing constraints).
- Parameters:
constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)
Notes
Adds to existing constraints. Use set_constraints() to replace all constraints. - Use set_constraints() to replace all constraints
- process_grad(U)[source]
Compute element-wise compliance sensitivities dC/drho.
- Parameters:
U (cp.ndarray) – Displacement vector from FEA solve
- Returns:
dk – Element sensitivities (compliance derivative per element)
- Return type:
cp.ndarray
Notes
Computes: dk[e] = -U[e]^T @ K_e @ U[e]
Used for compliance minimization gradient
Negative sign convention for minimization
- construct(rho)[source]
Construct CSR stiffness matrix representation.
- Parameters:
rho (cupy.ndarray) – Element density variables, shape (n_elements,)
- Returns:
Sparse CSR stiffness matrix K(rho), shape (n_dof, n_dof)
- Return type:
cupy.sparse.csr_matrix
Notes
Assembles full sparse matrix structure (first call) or updates values (subsequent)
Uses matrix-matrix product with identity to extract sparsity pattern
Caches sparsity pattern (ptr) for efficient updates
Sets has_been_constructed flag after first call
More memory-intensive than matrix-free operations but needed for some solvers
- dot(rhs)[source]
Matrix-vector or matrix-matrix product K @ rhs.
- Parameters:
rhs (cupy.ndarray or cupy.sparse.csr_matrix) – Right-hand side vector or matrix - Vector: shape (n_dof,) - Matrix: shape (n_dof, n_cols) or sparse CSR matrix
- Returns:
Result K @ rhs, same type and shape as rhs (except columns)
- Return type:
cupy.ndarray or cupy.sparse.csr_matrix
- Raises:
ValueError – If rho not set or rhs shape mismatch
NotImplementedError – If rhs is not cupy array or sparse matrix
Notes
Uses cached self.rho if available
Automatically detects vector vs matrix input
Converts sparse matrices to CSR format if needed
Main interface for matrix operations
- class pyFANTOM.stiffness.CUDA._FEA.UniformStiffnessKernel(mesh: CuGeneralMesh)[source]
Bases:
StiffnessKernelCUDA-accelerated stiffness kernel for uniform unstructured meshes.
GPU version of CPU UniformStiffnessKernel. Handles unstructured meshes where all elements have the same number of nodes and identical geometry, using element-specific stiffness matrices. More general than StructuredStiffnessKernel but less efficient. Identical API to CPU version but operates entirely on GPU memory.
- Parameters:
mesh (CuGeneralMesh) – CUDA general mesh with uniform element topology (mesh.is_uniform must be True)
- Ks
Element stiffness matrices, shape (n_elements, dof_per_element, dof_per_element) (on GPU)
- Type:
cupy.ndarray
- Same interface as StructuredStiffnessKernel (set_rho, dot, construct, diagonal, etc.)
Notes
Requires mesh.is_uniform == True (all elements same size)
Stores per-element stiffness matrices (more memory than StructuredStiffnessKernel)
Use for unstructured uniform meshes (e.g., triangular or tetrahedral)
For structured grids, prefer StructuredStiffnessKernel for better performance
All arrays stored as CuPy arrays on GPU
All operations on GPU using CuPy
Use with CUDA solvers (CG, MultiGrid) for best performance
Memory limited by GPU VRAM
Examples
>>> from pyFANTOM.CUDA import GeneralMesh, UniformStiffnessKernel >>> import numpy as np >>> # Triangular mesh >>> nodes = np.array([[0,0], [1,0], [0,1], [1,1]]) >>> elements = np.array([[0,1,2], [1,3,2]]) >>> mesh = GeneralMesh(nodes, elements) >>> kernel = UniformStiffnessKernel(mesh=mesh)
- __init__(mesh: CuGeneralMesh)[source]
Initialize base stiffness kernel.
Sets up attributes for density storage and matrix operation aliases. Subclasses should call super().__init__() and set mesh-specific attributes.
- diagonal(rho=None)[source]
Extract diagonal of stiffness matrix.
- Parameters:
rho (cp.ndarray, optional) – Design variables on GPU. If None, uses self.rho (must be set via set_rho())
- Returns:
Diagonal entries on GPU, shape (n_dof,)
- Return type:
cp.ndarray
- Raises:
ValueError – If rho is None and has_rho is False
Notes
Useful for Jacobi preconditioning or diagonal scaling in iterative solvers.
- set_constraints(constraints)[source]
Set Dirichlet boundary conditions (replaces existing constraints).
- Parameters:
constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)
Notes
Clears all existing constraints and sets only the specified DOFs. Use add_constraints() to add constraints incrementally.
- add_constraints(constraints)[source]
Add Dirichlet boundary conditions (accumulates with existing constraints).
- Parameters:
constraints (cp.ndarray) – DOF indices to constrain on GPU, shape (n_constrained_dofs,)
Notes
Adds to existing constraints. Use set_constraints() to replace all constraints.
- process_grad(U)[source]
Compute element-wise compliance sensitivities dC/drho.
- Parameters:
U (cp.ndarray) – Displacement vector from FEA solve on GPU, shape (n_dof,)
- Returns:
dk – Element sensitivities on GPU, shape (n_elements,)
- Return type:
cp.ndarray
Notes
Computes -U^T @ dK/drho @ U for each element, used in compliance sensitivity analysis. All operations performed on GPU.
- construct(rho)[source]
Build explicit CSR sparse matrix representation on GPU.
- Parameters:
rho (cp.ndarray) – Design variables on GPU, shape (n_elements,)
- Returns:
Sparse stiffness matrix in CSR format on GPU
- Return type:
cupyx.scipy.sparse.csr_matrix
Notes
Memory-intensive: O(nnz) storage on GPU
Required for direct solvers (SPSOLVE)
Reuses sparsity pattern on subsequent calls (faster)
- dot(rhs)[source]
Matrix-vector or matrix-matrix product K @ rhs.
- Parameters:
rhs (cupy.ndarray or cupy.sparse.csr_matrix) – Right-hand side vector or matrix - Vector: shape (n_dof,) - Matrix: shape (n_dof, n_cols) or sparse CSR matrix
- Returns:
Result K @ rhs, same type and shape as rhs (except columns)
- Return type:
cupy.ndarray or cupy.sparse.csr_matrix
- Raises:
ValueError – If rho not set or rhs shape mismatch
NotImplementedError – If rhs is not cupy array or sparse matrix
Notes
Uses cached self.rho if available
Automatically detects vector vs matrix input
Converts sparse matrices to CSR format if needed
Main interface for matrix operations
- class pyFANTOM.solvers.CUDA._solvers.CG(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]
Bases:
SolverCUDA-accelerated Conjugate Gradient iterative solver.
GPU version of the CG solver using CuPy. Efficiently solves large sparse symmetric positive definite linear systems Kx=b on GPU with optional matrix-free operation.
- Parameters:
kernel (CuStiffnessKernel) – CUDA stiffness kernel (matrix operator)
maxiter (int, optional) – Maximum number of CG iterations (default: 1000)
tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)
matrix_free (bool, optional) – Whether to use matrix-free mode. If None, automatically determined: - True for StructuredStiffnessKernel (recommended) - False for General/UniformStiffnessKernel (faster)
Notes
Matrix-free mode: Lower memory, slower per-iteration
Assembled matrix mode: Higher memory, faster per-iteration
For structured meshes, matrix-free is preferred
For general meshes, assembled matrix is faster
Examples
>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, CG >>> mesh = StructuredMesh2D(nx=256, ny=256, lx=1.0, ly=1.0) >>> kernel = StructuredStiffnessKernel(mesh=mesh) >>> solver = CG(kernel=kernel, maxiter=1000, tol=1e-6)
- solve(rhs, rho=None, use_last=True)[source]
Solve linear system K(rho) @ U = rhs on GPU.
- Parameters:
rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)
rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)
use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True
- Returns:
U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)
residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||
- Raises:
ValueError – If rho is not provided and kernel.has_rho is False
Notes
All operations performed on GPU using CuPy
Warm-starting (use_last=True) accelerates convergence
Matrix-free mode uses kernel.dot() directly
Assembled mode constructs explicit CSR matrix first
- class pyFANTOM.solvers.CUDA._solvers.GMRES(kernel: StiffnessKernel, maxiter=1000, tol=1e-05, matrix_free=None)[source]
Bases:
SolverCUDA-accelerated Generalized Minimal Residual iterative solver.
GPU version of GMRES solver using CuPy. Solves non-symmetric linear systems Kx=b on GPU. More general than CG but typically slower and more memory-intensive.
- Parameters:
kernel (CuStiffnessKernel) – CUDA stiffness kernel (matrix operator)
maxiter (int, optional) – Maximum number of GMRES iterations (default: 1000)
tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)
matrix_free (bool, optional) – Whether to use matrix-free mode. If None, automatically determined: - True for StructuredStiffnessKernel - False for General/UniformStiffnessKernel (recommended)
Notes
GMRES handles non-symmetric systems (rare in FEA)
CG is preferred for symmetric positive definite systems
Higher memory usage than CG due to Krylov subspace storage
Examples
>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, GMRES >>> solver = GMRES(kernel=kernel, maxiter=1000, tol=1e-6)
- solve(rhs, rho=None, use_last=True)[source]
Solve linear system K(rho) @ U = rhs on GPU using GMRES.
- Parameters:
rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)
rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)
use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True
- Returns:
U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)
residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||
- Raises:
ValueError – If rho is not provided and kernel.has_rho is False
Notes
All operations performed on GPU using CuPy
Warm-starting (use_last=True) accelerates convergence
Matrix-free mode uses kernel.dot() directly
Assembled mode constructs explicit CSR matrix first
GMRES handles non-symmetric systems (rare in FEA)
CG is preferred for symmetric positive definite systems
- class pyFANTOM.solvers.CUDA._solvers.SPSOLVE(kernel: StiffnessKernel)[source]
Bases:
SolverCUDA sparse direct solver using CuPy’s spsolve.
GPU-accelerated direct sparse LU factorization solver. Fast for small to medium problems but memory-intensive. Limited to < 3M DOF due to GPU memory constraints.
- Parameters:
kernel (CuStiffnessKernel) – CUDA stiffness kernel (must construct explicit matrix)
- Raises:
ValueError – If kernel.shape[0] > 3e6 (3 million DOF limit)
Notes
Direct solver: No iterations, exact solution (within numerical precision)
High memory usage: Stores factorization on GPU
Use iterative solvers (CG, MultiGrid) for large problems
Automatically handles constrained DOF
Examples
>>> from pyFANTOM.CUDA import StructuredMesh2D, StructuredStiffnessKernel, SPSOLVE >>> mesh = StructuredMesh2D(nx=64, ny=64, lx=1.0, ly=1.0) >>> kernel = StructuredStiffnessKernel(mesh=mesh) >>> solver = SPSOLVE(kernel=kernel) # OK for small problems
- solve(rhs, rho=None, **kwargs)[source]
Solve the linear system Kx = rhs using sparse direct solver.
- Parameters:
rhs (cp.ndarray) – Right-hand side vector
rho (cp.ndarray, optional) – Design variable densities. If None, uses kernel’s stored rho
**kwargs – Additional keyword arguments (unused, for API compatibility)
- Returns:
U (cp.ndarray) – Solution vector
residual (float) – Relative residual norm ||rhs - K@U|| / ||rhs||
Notes
Direct solver: No warm start needed
Constructs and factorizes K every call (expensive)
Consider iterative solvers for repeated solves in optimization
- class pyFANTOM.solvers.CUDA._solvers.MultiGrid(mesh: CuStructuredMesh2D | CuStructuredMesh3D, kernel: StiffnessKernel, maxiter=1000, tol=1e-05, n_smooth=3, omega=0.5, n_level=3, cycle='W', w_level=1, coarse_solver='cholmod', matrix_free=False, low_level_tol=1e-08, low_level_maxiter=5000, min_omega=0.4, omega_boost=1.06)[source]
Bases:
SolverCUDA-accelerated Geometric Multigrid solver for structured meshes.
GPU implementation of geometric multigrid with Jacobi smoothing and coarse grid correction. Most memory-efficient solver for large 3D problems on GPU. Uses restriction/prolongation operators on nested grids for O(n) complexity per iteration.
- Parameters:
mesh (StructuredMesh2D or StructuredMesh3D) – CUDA structured mesh (required for geometric coarsening)
kernel (CuStiffnessKernel) – CUDA stiffness kernel (must be StructuredStiffnessKernel)
maxiter (int, optional) – Maximum outer iterations (preconditioned CG) (default: 1000)
tol (float, optional) – Convergence tolerance for relative residual (default: 1e-5)
n_smooth (int, optional) – Number of Jacobi smoothing iterations per level (default: 3)
omega (float, optional) – Jacobi relaxation parameter (default: 0.5)
n_level (int, optional) – Number of multigrid levels (default: 3)
cycle (str, optional) – Multigrid cycle type: ‘V’ or ‘W’ (default: ‘W’)
w_level (int or list, optional) – Levels to apply W-cycle at (default: 1)
coarse_solver (str, optional) – Coarse grid solver: ‘cholmod’, ‘cg’, ‘gmres’, ‘splu’, ‘spsolve’ (default: ‘cholmod’)
matrix_free (bool, optional) – Whether to use matrix-free operators (default: False, not yet implemented)
low_level_tol (float, optional) – Tolerance for coarse grid iterative solver (default: 1e-8)
low_level_maxiter (int, optional) – Max iterations for coarse grid iterative solver (default: 5000)
min_omega (float, optional) – Minimum omega for adaptive relaxation (default: 0.4)
omega_boost (float, optional) – Omega boost factor per level (default: 1.06)
Notes
Only works with structured meshes (uniform grid coarsening)
Memory efficient: No explicit matrix storage
Preconditioned CG with multigrid preconditioner
Adaptive omega based on residual
CHOLMOD coarse solver requires CPU transfer (fast for small coarse grids)
- Raises:
ValueError – If matrix_free=True with splu/spsolve coarse solver If coarse_solver not in [‘cg’, ‘gmres’, ‘splu’, ‘spsolve’, ‘cholmod’]
Examples
>>> from pyFANTOM.CUDA import StructuredMesh3D, StructuredStiffnessKernel, MultiGrid >>> mesh = StructuredMesh3D(nx=128, ny=128, nz=128, lx=1.0, ly=1.0, lz=1.0) >>> kernel = StructuredStiffnessKernel(mesh=mesh) >>> solver = MultiGrid(mesh=mesh, kernel=kernel, n_level=4, cycle='W')
- __init__(mesh: CuStructuredMesh2D | CuStructuredMesh3D, kernel: StiffnessKernel, maxiter=1000, tol=1e-05, n_smooth=3, omega=0.5, n_level=3, cycle='W', w_level=1, coarse_solver='cholmod', matrix_free=False, low_level_tol=1e-08, low_level_maxiter=5000, min_omega=0.4, omega_boost=1.06)[source]
- solve(rhs, rho=None, use_last=True)[source]
Solve linear system K(rho) @ U = rhs on GPU using multigrid-preconditioned CG.
- Parameters:
rhs (cp.ndarray) – Right-hand side force vector on GPU, shape (n_dof,)
rho (cp.ndarray, optional) – Design variables (densities) on GPU, shape (n_elements,)
use_last (bool, optional) – Use previous solution as initial guess (warm-starting), default: True
- Returns:
U (cp.ndarray) – Solution vector on GPU, shape (n_dof,)
residual (float) – Relative residual: ||K@U - rhs|| / ||rhs||
- Raises:
ValueError – If rho is not provided and kernel.has_rho is False
Notes
All operations performed on GPU using CuPy
First call builds multigrid hierarchy (expensive, ~1-5 seconds)
Subsequent calls reuse hierarchy (fast, ~0.1-1 second)
Uses preconditioned CG with multigrid V/W-cycle preconditioner
Adaptive omega adjusts relaxation based on residual reduction
Frees GPU memory after solve to prevent accumulation
Most memory-efficient solver for large 3D problems (>10M DOF)
- class pyFANTOM.Optimizers.CUDA.MMA.MMA(problem: Problem, sub_tol=1e-07, sub_maxiter=100, change_tol=0.0001, fun_tol=1e-06, move=0.5, timer=False)[source]
Bases:
OptimizerCUDA-accelerated Method of Moving Asymptotes (MMA) optimizer.
GPU version of MMA using CuPy arrays. Identical API to CPU version but operates on GPU memory for maximum performance in large-scale topology optimization.
- Parameters:
problem (Problem) – Optimization problem (e.g., MinimumCompliance) with CUDA backend
sub_tol (float, optional) – Sub-problem convergence tolerance (default: 1e-7)
sub_maxiter (int, optional) – Sub-problem max iterations (default: 100)
change_tol (float, optional) – Design variable change convergence tolerance (default: 1e-4)
fun_tol (float, optional) – Objective function change tolerance (default: 1e-6)
move (float, optional) – Move limit for design variables, range [0,1] (default: 0.5)
timer (bool, optional) – Return iteration time if True (default: False)
Notes
All arrays stored as CuPy arrays on GPU
5-10x faster than CPU for large problems (>1M DOF)
Requires CUDA-capable GPU and CuPy
Best for: General constraints, multi-material, multi-constraint problems
Examples
>>> from pyFANTOM.CUDA import MMA, MinimumCompliance >>> optimizer = MMA(problem=problem, sub_tol=1e-7, fun_tol=1e-6) >>> for i in range(100): >>> optimizer.iter() >>> if optimizer.converged(): >>> break
- __init__(problem: Problem, sub_tol=1e-07, sub_maxiter=100, change_tol=0.0001, fun_tol=1e-06, move=0.5, timer=False)[source]
Create an optimizer bound to a problem.
Parameters - problem: instance of
Problemproviding objective and gradients.
- iter()[source]
Perform one MMA optimization iteration on GPU.
Updates design variables by solving a convex sub-problem approximation of the original problem. Uses moving asymptotes to control step size. All operations performed on GPU using CuPy.
- Returns:
If timer=True, returns iteration time in seconds
- Return type:
float, optional
Notes
Solves convex sub-problem using mmasub() on GPU
Updates moving asymptotes (low, upp) for next iteration
Tracks design variable change and objective change
Handles NaN values by clipping to bounds
- converged(*args, **kwargs)[source]
Check if optimizer has converged.
- Returns:
True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol
- Return type:
Notes
Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.
- logs()[source]
Return diagnostic information for current iteration.
- Returns:
Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)
- Return type:
Notes
Used for monitoring optimization progress and convergence.
- class pyFANTOM.Optimizers.CUDA.OC.OC(problem: Problem, move: float = 0.2, change_tol=0.0001, fun_tol=1e-06, timer=False)[source]
Bases:
OptimizerCUDA-accelerated Optimality Criteria (OC) optimizer.
GPU version of OC using CuPy arrays. Fast heuristic optimizer specifically designed for topology optimization with independent volume constraints. Identical API to CPU version.
- Parameters:
problem (Problem) – Optimization problem (must have independent constraints) with CUDA backend
move (float, optional) – Move limit for design variables (default: 0.2)
change_tol (float, optional) – Convergence tolerance for design change (default: 1e-4)
fun_tol (float, optional) – Convergence tolerance for objective change (default: 1e-6)
timer (bool, optional) – Return iteration time (default: False)
Notes
All arrays stored as CuPy arrays on GPU
2-3x faster than CPU per iteration
Best for: Classic minimum compliance with volume constraint
Limited scope: Only works with independent constraints
Requires CUDA-capable GPU and CuPy
Examples
>>> from pyFANTOM.CUDA import OC >>> optimizer = OC(problem=problem, move=0.2) >>> for i in range(200): >>> optimizer.iter() >>> if optimizer.converged(): >>> break
- __init__(problem: Problem, move: float = 0.2, change_tol=0.0001, fun_tol=1e-06, timer=False)[source]
Create an optimizer bound to a problem.
Parameters - problem: instance of
Problemproviding objective and gradients.
- iter()[source]
Perform one OC optimization iteration on GPU.
Updates design variables using optimality criteria update formula with bisection to find Lagrange multiplier satisfying volume constraint. All operations performed on GPU using CuPy.
- Returns:
If timer=True, returns iteration time in seconds
- Return type:
float, optional
Notes
Computes OC parameter: ocP = x * sqrt(-df/dg)
Uses bisection to find Lagrange multiplier lambda
Updates: x_new = clip(xL, xU, ocP/lambda)
Enforces move limits: xL = x - move, xU = x + move
- converged(*args, **kwargs)[source]
Check if optimizer has converged.
- Returns:
True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol
- Return type:
Notes
Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.
- logs()[source]
Return diagnostic information for current iteration.
- Returns:
Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)
- Return type:
Notes
Used for monitoring optimization progress and convergence.
- class pyFANTOM.Optimizers.CUDA.PGD.PGD(problem: Problem, change_tol=0.0001, fun_tol=1e-06, maxiter_N=50, tol_B=1e-08, tol_N=1e-06, C=1000000000000.0, fall_back_move=0.2, alpha_max=100.0, relaxation=1.0, warmup_iter=50, timer=False, conjugate_directions=True)[source]
Bases:
OptimizerCUDA-accelerated Projected Gradient Descent optimizer.
GPU version of PGD using CuPy arrays. Fast first-order optimizer using BFGS-based step size adaptation and projection onto feasible set. Identical API to CPU version.
- Parameters:
problem (Problem) – Optimization problem with CUDA backend
change_tol (float, optional) – Design variable change tolerance (default: 1e-4)
fun_tol (float, optional) – Objective function change tolerance (default: 1e-6)
maxiter_N (int, optional) – Max iterations for Newton’s method in projection (default: 50)
tol_B (float, optional) – Bisection tolerance (default: 1e-8)
tol_N (float, optional) – Newton solver tolerance (default: 1e-6)
C (float, optional) – Large constant for penalty (default: 1e12)
fall_back_move (float, optional) – Fallback step size when BFGS fails (default: 0.2)
alpha_max (float, optional) – Maximum step size (default: 1e2)
relaxation (float, optional) – Step size relaxation parameter (default: 1.0)
warmup_iter (int, optional) – Iterations before enforcing strict feasibility (default: 50)
timer (bool, optional) – Return iteration time (default: False)
conjugate_directions (bool, optional) – Use conjugate gradient directions (default: True)
Notes
All arrays stored as CuPy arrays on GPU
Faster than CPU: ~30-50% faster per iteration
Best for: Simple bounds, single constraint problems
Less robust: May struggle with multiple nonlinear constraints
Requires CUDA-capable GPU and CuPy
Examples
>>> from pyFANTOM.CUDA import PGD >>> optimizer = PGD(problem=problem, conjugate_directions=True) >>> for i in range(100): >>> optimizer.iter() >>> if optimizer.converged(): >>> break
- __init__(problem: Problem, change_tol=0.0001, fun_tol=1e-06, maxiter_N=50, tol_B=1e-08, tol_N=1e-06, C=1000000000000.0, fall_back_move=0.2, alpha_max=100.0, relaxation=1.0, warmup_iter=50, timer=False, conjugate_directions=True)[source]
Create an optimizer bound to a problem.
Parameters - problem: instance of
Problemproviding objective and gradients.
- iter()[source]
Perform one PGD optimization iteration on GPU.
Updates design variables using projected gradient descent with adaptive step size (BFGS-based) and projection onto feasible set. All operations performed on GPU using CuPy.
- Returns:
If timer=True, returns iteration time in seconds
- Return type:
float, optional
Notes
Computes step size using BFGS approximation (alpha)
Uses conjugate gradient directions if enabled
Projects update onto feasible set (satisfies constraints)
For independent constraints: uses bisection
For coupled constraints: uses Newton’s method with Wolfe line search
- converged(*args, **kwargs)[source]
Check if optimizer has converged.
- Returns:
True if convergence criteria are met: - Problem penalty continuation is complete (is_terminal() == True) - Design variable change <= change_tol - Objective function change <= fun_tol
- Return type:
Notes
Convergence requires both design change and objective change to be below tolerances. Also checks that penalty continuation (if used) has finished.
- logs()[source]
Return diagnostic information for current iteration.
- Returns:
Dictionary with keys: - ‘objective’: Current objective function value - ‘variable change’: L2 norm of design variable change - ‘function change’: Relative objective function change - Additional keys from problem.logs() (e.g., ‘iteration’, ‘residual’)
- Return type:
Notes
Used for monitoring optimization progress and convergence.
- class pyFANTOM.Problem.CUDA.MinimumCompliance.MinimumCompliance(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]
Bases:
ProblemCUDA-accelerated minimum compliance topology optimization problem.
GPU version of MinimumCompliance using CuPy arrays. Identical API to CPU version but operates on GPU memory for maximum performance in large-scale optimization.
- Parameters:
FE (FiniteElement) – CUDA finite element analysis engine with mesh, kernel, and solver
filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter for ensuring minimum feature sizes
E_mul (list of float, optional) – Young’s modulus multipliers for each material (default: [1.0] for single material)
void (float, optional) – Minimum density to avoid singularity (default: 1e-6)
penalty (float, optional) – SIMP penalization exponent (default: 3.0). Higher = more binary designs
volume_fraction (list of float, optional) – Volume fraction constraint for each material (default: [0.25])
penalty_schedule (callable, optional) – Function(p, iteration) for penalty continuation. If None, uses constant penalty
heavyside (bool, optional) – Apply Heaviside projection for sharper 0-1 designs (default: True)
beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.
eta (float, optional) – Heaviside projection threshold (default: 0.5)
Notes
All arrays stored as CuPy arrays on GPU
5-10x faster than CPU for large problems (>1M DOF)
Requires CUDA-capable GPU and CuPy
Identical SIMP formulation and Heaviside projection as CPU version
Examples
>>> from pyFANTOM.CUDA import * >>> mesh = StructuredMesh2D(nx=256, ny=128, lx=2.0, ly=1.0) >>> kernel = StructuredStiffnessKernel(mesh=mesh) >>> solver = CG(kernel=kernel) >>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver) >>> filter = StructuredFilter2D(mesh=mesh, r_min=1.5) >>> problem = MinimumCompliance(FE=FE, filter=filter, penalty=3.0, volume_fraction=[0.3])
- __init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]
Initialize problem state.
Subclasses may accept finite-element handlers, filters, and other configuration parameters.
- N()[source]
Return the number of design variables.
- Returns:
Total number of design variables (n_elements * n_materials)
- Return type:
- m()[source]
Return the number of constraints.
- Returns:
Number of volume constraints (1 for single material, n_materials for multi-material)
- Return type:
- is_independent()[source]
Check if constraints are independent (required for OC optimizer).
- Returns:
True if constraints are independent (always True for MinimumCompliance)
- Return type:
- constraint_map()[source]
Return mapping of constraints to design variables.
- Returns:
Single material: 1 (scalar)
Multi-material: array of shape (n_materials, n_vars) with 1s indicating which design variables belong to each material constraint
- Return type:
int or cp.ndarray
Notes
Used by optimizers to identify which design variables affect each constraint. For multi-material, constraint i affects variables [i*n_elements:(i+1)*n_elements].
- bounds()[source]
Return bounds for design variables.
- Returns:
(lower_bound, upper_bound) where both are 0.0 and 1.0 respectively
- Return type:
Notes
Design variables represent normalized densities in [0, 1].
- visualize_problem(**kwargs)[source]
Visualize problem setup (mesh, BCs, loads).
- Parameters:
**kwargs – Arguments passed to FE.visualize_problem()
- Returns:
Plot object
- Return type:
matplotlib.axes.Axes or k3d.Plot
Notes
GPU arrays are automatically transferred to CPU for visualization.
- visualize_solution(**kwargs)[source]
Visualize optimized design (density distribution).
- Parameters:
**kwargs – Arguments passed to FE.visualize_density()
- Returns:
Plot object
- Return type:
matplotlib.axes.Axes or k3d.Plot
Notes
Shows current design variables as density field. For multi-material problems, displays combined material distribution. GPU arrays transferred to CPU for visualization.
- init_desvars()[source]
Initialize design variables to uniform density at volume fraction.
Sets all design variables to the volume fraction constraint value and performs initial FEA solve to compute objective and constraints.
Notes
Single material: all variables set to volume_fraction
Multi-material: all variables set to min(volume_fraction) for each material
Resets iteration counter to 0
Triggers _compute() to evaluate objective and constraints
All operations performed on GPU
- set_desvars(desvars: cupy.ndarray)[source]
Set design variables and recompute objective/constraints.
- Parameters:
desvars (cp.ndarray) – Design variables on GPU, shape (n_vars,). Values should be in [0, 1]
- Raises:
ValueError – If desvars shape doesn’t match num_vars
Notes
Triggers FEA solve and sensitivity computation on GPU
Increments iteration counter
Updates cached values: _f, _g, _nabla_f, _nabla_g
- get_desvars()[source]
Get current design variables.
- Returns:
Current design variables on GPU, shape (n_vars,)
- Return type:
cp.ndarray
Notes
Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).
- f(rho: cupy.ndarray | None = None)[source]
Compute objective function value (compliance).
- Parameters:
rho (cp.ndarray, optional) – Design variables on GPU for linearization. If None, returns cached value
- Returns:
Compliance value (F^T @ U). Lower is better (stiffer structure).
- Return type:
Notes
If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)
If rho is None: returns cached value from last set_desvars() call
Compliance = F^T @ U = U^T @ K @ U (strain energy)
All operations performed on GPU
- nabla_f(rho: cupy.ndarray | None = None)[source]
Compute objective function gradient (compliance sensitivities).
- Parameters:
rho (cp.ndarray, optional) – Unused (for interface compatibility)
- Returns:
Gradient of compliance w.r.t. design variables on GPU, shape (n_vars,)
- Return type:
cp.ndarray
Notes
Uses adjoint method: dC/drho = -U^T @ dK/drho @ U
Includes filter adjoint: sens_raw = H^T @ sens_filtered
Negative gradient means increasing density reduces compliance (good)
All operations performed on GPU
- g(rho=None)[source]
Compute constraint values (volume fraction violations).
- Parameters:
rho (cp.ndarray, optional) – Design variables on GPU. If None, uses current desvars
- Returns:
Constraint violations, shape (n_constraints,). Negative = satisfied. g[i] = (volume_fraction[i] - actual_volume_fraction[i])
- Return type:
cp.ndarray
Notes
Constraint satisfied when g <= 0
For single material: returns scalar
For multi-material: returns array with one constraint per material
All operations performed on GPU
- nabla_g()[source]
Compute constraint gradients (volume fraction sensitivities).
- Returns:
Gradient of constraints w.r.t. design variables on GPU. - Single material: shape (n_vars,) - Multi-material: shape (n_materials, n_vars)
- Return type:
cp.ndarray
Notes
Each row is d(volume_fraction[i])/drho
For uniform elements, gradient is constant: 1/volume_total per element
Used by optimizers to enforce volume constraints
- ill_conditioned()[source]
Check if FEA system is ill-conditioned.
- Returns:
True if residual >= 1e-2 (indicates poor solver convergence)
- Return type:
Notes
Residual > 1e-2 suggests numerical issues (check void parameter, penalty)
May indicate near-singular stiffness matrix (too many void elements)
Consider increasing void parameter or reducing penalty
- is_terminal()[source]
Check if penalty continuation has reached final value.
- Returns:
True if penalty schedule is complete or not used
- Return type:
Notes
Used by optimizers to determine if continuation is finished
If penalty_schedule is None, always returns True
If penalty_schedule exists, checks if current iteration has reached final penalty
- class pyFANTOM.Problem.CUDA.ComplianceConstrainedMinimumVolume.ComplianceConstrainedMinimumVolume(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E: float = 1.0, void: float = 1e-06, penalty: float = 3.0, compliance_limit: float = 0.25, penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]
Bases:
ProblemMinimum volume topology optimization with compliance constraint (CUDA).
GPU-accelerated version of ComplianceConstrainedMinimumVolume. Inverse formulation of MinimumCompliance: minimize material usage while maintaining structure stiffness above threshold. All computations performed on GPU using CuPy.
- Parameters:
FE (FiniteElement) – CUDA finite element analysis engine
filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter
E (float, optional) – Young’s modulus (default: 1.0)
void (float, optional) – Minimum density to avoid singularity (default: 1e-6)
penalty (float, optional) – SIMP penalization exponent (default: 3.0)
compliance_limit (float, optional) – Maximum allowable compliance (normalized by initial compliance) (default: 0.25)
penalty_schedule (callable, optional) – Penalty continuation function(p, iteration)
heavyside (bool, optional) – Apply Heaviside projection (default: True)
beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.
eta (float, optional) – Heaviside threshold (default: 0.5)
Notes
Objective: Minimize volume
Constraint: Compliance ≤ compliance_limit * initial_compliance
Use case: Lightweight design, material cost minimization
Comparison: More challenging than MinimumCompliance (compliance constraint is nonlinear)
Typical compliance_limit: 0.2-0.5 of initial design
GPU Acceleration: All arrays stored as CuPy arrays on GPU
Performance: 5-10x faster than CPU for large problems
Examples
>>> from pyFANTOM.CUDA import * >>> problem = ComplianceConstrainedMinimumVolume( >>> FE=FE, filter=filter, >>> compliance_limit=0.3, # 30% of initial compliance >>> penalty=3.0 >>> )
- __init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, E: float = 1.0, void: float = 1e-06, penalty: float = 3.0, compliance_limit: float = 0.25, penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]
Initialize problem state.
Subclasses may accept finite-element handlers, filters, and other configuration parameters.
- N()[source]
Return the number of design variables.
- Returns:
Total number of design variables (n_elements)
- Return type:
- m()[source]
Return the number of constraints.
- Returns:
Number of constraints (1: compliance constraint)
- Return type:
- is_independent()[source]
Check if constraints are independent (required for OC optimizer).
- Returns:
True if constraints are independent (always True for ComplianceConstrainedMinimumVolume)
- Return type:
- constraint_map()[source]
Return mapping of constraints to design variables.
- Returns:
Scalar value 1 indicating all design variables affect the compliance constraint
- Return type:
Notes
Used by optimizers to identify which design variables affect the constraint. For this problem, all variables affect compliance.
- bounds()[source]
Return bounds for design variables.
- Returns:
(lower_bound, upper_bound) where both are 0.0 and 1.0 respectively
- Return type:
Notes
Design variables represent normalized densities in [0, 1].
- visualize_problem(**kwargs)[source]
Visualize problem setup (mesh, BCs, loads).
- Parameters:
**kwargs – Arguments passed to FE.visualize_problem()
- Returns:
Plot object
- Return type:
matplotlib.axes.Axes or k3d.Plot
- visualize_solution(**kwargs)[source]
Visualize optimized design (density distribution).
- Parameters:
**kwargs – Arguments passed to FE.visualize_density()
- Returns:
Plot object
- Return type:
matplotlib.axes.Axes or k3d.Plot
Notes
Shows current design variables as density field.
- init_desvars()[source]
Initialize design variables to full density (volume = 1.0).
Sets all design variables to 1.0 (full material) and performs initial FEA solve to compute objective and constraints. This ensures initial design satisfies compliance constraint.
Notes
All variables set to 1.0 (full material)
Resets iteration counter to 0
Triggers _compute() to evaluate objective and constraints
- set_desvars(desvars: cupy.ndarray)[source]
Set design variables and recompute objective/constraints.
- Parameters:
desvars (cupy.ndarray) – Design variables, shape (n_vars,). Values should be in [0, 1]
- Raises:
ValueError – If desvars shape doesn’t match num_vars
Notes
Triggers FEA solve and sensitivity computation
Increments iteration counter
Updates cached values: _f, _g, _nabla_f, _nabla_g
- get_desvars()[source]
Get current design variables.
- Returns:
Current design variables, shape (n_vars,)
- Return type:
cupy.ndarray
Notes
Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).
- penalize(rho: cupy.ndarray)[source]
Apply SIMP penalization and Heaviside projection.
- Parameters:
rho (cupy.ndarray) – Filtered densities, shape (n_elements,)
- Returns:
Penalized Young’s moduli, shape (n_elements,)
- Return type:
cupy.ndarray
Notes
Applies Heaviside projection if enabled
Applies SIMP: E = E0 * rho^penalty
Clips to void value to avoid singularity
Uses penalty_schedule if provided
Uses beta_schedule if beta is callable
- penalize_grad(rho: cupy.ndarray)[source]
Compute gradient of penalization function.
- Parameters:
rho (cupy.ndarray) – Filtered densities, shape (n_elements,)
- Returns:
Gradient dE/drho, shape (n_elements,)
- Return type:
cupy.ndarray
Notes
Derivative of SIMP + Heaviside projection
Used in chain rule for sensitivity analysis
Accounts for penalty_schedule if provided
Accounts for beta_schedule if beta is callable
- f(rho: cupy.ndarray | None = None)[source]
Compute objective function value (volume).
- Parameters:
rho (cupy.ndarray, optional) – Design variables for linearization. If None, returns cached value
- Returns:
Volume fraction (normalized by total domain volume). Lower is better.
- Return type:
Notes
If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)
If rho is None: returns cached value from last set_desvars() call
Objective is to minimize material usage while satisfying compliance constraint
- nabla_f(rho: cupy.ndarray | None = None)[source]
Compute objective function gradient (volume sensitivities).
- Parameters:
rho (cupy.ndarray, optional) – Unused (for interface compatibility)
- Returns:
Gradient of volume w.r.t. design variables, shape (n_vars,)
- Return type:
cupy.ndarray
Notes
Constant gradient: 1/volume_total per element
All elements contribute equally to volume
Used by optimizers to minimize material usage
- g(rho=None)[source]
Compute constraint values (compliance constraint violations).
- Parameters:
rho (cupy.ndarray, optional) – Design variables. If None, uses current desvars
- Returns:
Normalized compliance constraint violation. Negative = satisfied. g = (compliance - compliance_limit) / compliance_limit
- Return type:
Notes
Constraint satisfied when g <= 0
Normalized by compliance_limit for better scaling
If rho provided, uses linearized approximation
- nabla_g()[source]
Compute constraint gradients (compliance sensitivities).
- Returns:
Gradient of compliance constraint w.r.t. design variables, shape (n_vars,)
- Return type:
cupy.ndarray
Notes
Normalized by compliance_limit for better scaling
Uses adjoint method: dC/drho = -U^T @ dK/drho @ U
Includes filter adjoint: sens_raw = H^T @ sens_filtered
Used by optimizers to enforce compliance constraint
- ill_conditioned()[source]
Check if FEA system is ill-conditioned.
- Returns:
True if residual >= 1e-2 (indicates poor solver convergence)
- Return type:
Notes
Residual > 1e-2 suggests numerical issues (check void parameter, penalty)
May indicate near-singular stiffness matrix (too many void elements)
Consider increasing void parameter or reducing penalty
- is_terminal()[source]
Check if penalty continuation has reached final value.
- Returns:
True if penalty schedule is complete or not used
- Return type:
Notes
Used by optimizers to determine if continuation is finished
If penalty_schedule is None, always returns True
If penalty_schedule exists, checks if current iteration has reached final penalty
- class pyFANTOM.Problem.CUDA.WeightDistributionMinimumCompliance.WeightDistributionMinimumCompliance(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, target_centroid: List[float], maximum_distance: float, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]
Bases:
ProblemMinimum compliance with weight distribution constraint (CUDA).
GPU-accelerated version of WeightDistributionMinimumCompliance. Extends MinimumCompliance with an additional constraint on the centroid location of the material distribution. Useful for balancing structures or controlling center of mass for dynamic applications. All computations performed on GPU using CuPy.
- Parameters:
FE (FiniteElement) – CUDA finite element analysis engine with mesh, kernel, and solver
filter (StructuredFilter2D, StructuredFilter3D, or GeneralFilter) – CUDA density filter for ensuring minimum feature sizes
target_centroid (List[float]) – Target centroid location in physical coordinates, shape (2,) for 2D or (3,) for 3D
maximum_distance (float) – Maximum allowed distance from target centroid (constraint: ||centroid - target|| <= maximum_distance)
E_mul (list of float, optional) – Young’s modulus multipliers for each material (default: [1.0] for single material)
void (float, optional) – Minimum density to avoid singularity (default: 1e-6)
penalty (float, optional) – SIMP penalization exponent (default: 3.0). Higher = more binary designs
volume_fraction (list of float, optional) – Volume fraction constraint for each material (default: [0.25])
penalty_schedule (callable, optional) – Function(p, iteration) for penalty continuation. If None, uses constant penalty
heavyside (bool, optional) – Apply Heaviside projection for sharper 0-1 designs (default: True)
beta (float or callable, optional) – Heaviside projection sharpness parameter. Can be a float (default: 2) or a callable function of iteration: beta(iteration) -> float. Enables beta continuation for gradual Heaviside sharpening during optimization.
eta (float, optional) – Heaviside projection threshold (default: 0.5)
- target_centroid
Target centroid location
- Type:
cupy.ndarray
Notes
Constraints: - Volume constraint: Same as MinimumCompliance - Centroid constraint: ||centroid(rho) - target_centroid|| <= maximum_distance - Constraints are coupled (not independent), so OC optimizer cannot be used
Centroid Computation: - Centroid = sum(rho[i] * volume[i] * centroid[i]) / sum(rho[i] * volume[i]) - Weighted by element density and volume
Use Cases: - Balancing structures for dynamic loads - Controlling center of mass - Symmetric designs with off-center loading - Multi-material designs with weight distribution requirements
GPU Acceleration: - All arrays stored as CuPy arrays on GPU - 5-10x faster than CPU for large problems
Examples
>>> from pyFANTOM.CUDA import * >>> # Setup FEA >>> mesh = StructuredMesh2D(nx=128, ny=64, lx=2.0, ly=1.0) >>> kernel = StructuredStiffnessKernel(mesh=mesh) >>> solver = CG(kernel=kernel) >>> FE = FiniteElement(mesh=mesh, kernel=kernel, solver=solver) >>> >>> # Apply BCs and loads >>> FE.add_dirichlet_boundary_condition(node_ids=fixed_nodes, rhs=0) >>> FE.add_point_forces(node_ids=load_nodes, forces=cp.array([[0, -1.0]])) >>> >>> # Define optimization problem with centroid constraint >>> filter = StructuredFilter2D(mesh=mesh, r_min=1.5) >>> problem = WeightDistributionMinimumCompliance( >>> FE=FE, filter=filter, >>> target_centroid=[1.0, 0.5], # Center of 2D domain >>> maximum_distance=0.1, # 10% of domain size >>> penalty=3.0, volume_fraction=[0.3] >>> )
- __init__(FE: FiniteElement, filter: CuStructuredFilter2D | CuStructuredFilter3D | CuGeneralFilter, target_centroid: List[float], maximum_distance: float, E_mul: list[float] = [1.0], void: float = 1e-06, penalty: float = 3.0, volume_fraction: list[float] = [0.25], penalty_schedule: Callable[[float, int], float] | None = None, heavyside: bool = True, beta: float | Callable[[int], float] = 2, eta: float = 0.5)[source]
Initialize problem state.
Subclasses may accept finite-element handlers, filters, and other configuration parameters.
- N()[source]
Return the number of design variables.
- Returns:
Total number of design variables (n_elements * n_materials)
- Return type:
- m()[source]
Return the number of constraints.
- Returns:
Number of constraints: - Single material: 2 (volume + centroid distance) - Multi-material: n_materials + 1 (volume per material + centroid distance)
- Return type:
- is_independent()[source]
Check if constraints are independent (required for OC optimizer).
- Returns:
Always False. Centroid constraint couples all design variables, making constraints non-independent. Use MMA or PGD optimizer instead of OC.
- Return type:
- constraint_map()[source]
Return mapping of constraints to design variables.
- Returns:
Single material: 1 (scalar, all variables affect both constraints)
Multi-material: array of shape (n_materials + 1, n_vars) where: - Rows 0 to n_materials-1: volume constraints per material - Last row: centroid constraint (all variables = 1)
- Return type:
int or ndarray
Notes
Used by optimizers to identify which design variables affect each constraint. The centroid constraint depends on all design variables, making constraints coupled.
- bounds()[source]
Return bounds for design variables.
- Returns:
(lower_bound, upper_bound) where both are 0.0 and 1.0 respectively
- Return type:
Notes
Design variables represent normalized densities in [0, 1].
- visualize_problem(**kwargs)[source]
Visualize problem setup (mesh, BCs, loads).
- Parameters:
**kwargs – Arguments passed to FE.visualize_problem()
- Returns:
Plot object
- Return type:
matplotlib.axes.Axes or k3d.Plot
- visualize_solution(**kwargs)[source]
Visualize optimized design (density distribution).
- Parameters:
**kwargs – Arguments passed to FE.visualize_density()
- Returns:
Plot object
- Return type:
matplotlib.axes.Axes or k3d.Plot
Notes
Shows current design variables as density field. For multi-material problems, displays combined material distribution.
- init_desvars()[source]
Initialize design variables with random smoothed distribution.
Uses random initialization followed by triple filtering to create a smooth initial guess. This helps avoid getting stuck in poor local minima that may violate the centroid constraint.
Notes
Random initialization: uniform distribution in [min(0, 2*vf-1), 1.0]
Triple filtering: applies filter 3 times for smoothness
Normalized to match volume fraction constraint
Resets iteration counter to 0
Triggers _compute() to evaluate objective and constraints
- set_desvars(desvars: cupy.ndarray)[source]
Set design variables and recompute objective/constraints.
- Parameters:
desvars (cupy.ndarray) – Design variables, shape (n_vars,). Values should be in [0, 1]
- Raises:
ValueError – If desvars shape doesn’t match num_vars
Notes
Triggers FEA solve and sensitivity computation
Increments iteration counter
Updates cached values: _f, _g, _nabla_f, _nabla_g
Computes centroid and centroid constraint gradient
- get_desvars()[source]
Get current design variables.
- Returns:
Current design variables, shape (n_vars,)
- Return type:
cupy.ndarray
Notes
Returns raw (unfiltered) design variables. For filtered densities, access via problem.filter.dot(problem.get_desvars()).
- penalize(rho: cupy.ndarray)[source]
Apply SIMP penalization and Heaviside projection.
- Parameters:
rho (cupy.ndarray) – Filtered densities, shape (n_elements,) or (n_elements, n_materials)
- Returns:
Penalized Young’s moduli, shape (n_elements,)
- Return type:
cupy.ndarray
Notes
Applies Heaviside projection if enabled
Applies SIMP: E = E0 * rho^penalty
Clips to void value to avoid singularity
Uses penalty_schedule if provided
Uses beta_schedule if beta is callable
For multi-material: uses material interpolation scheme
- penalize_grad(rho: cupy.ndarray)[source]
Compute gradient of penalization function.
- Parameters:
rho (cupy.ndarray) – Filtered densities, shape (n_elements,) or (n_elements, n_materials)
- Returns:
Gradient dE/drho, shape (n_elements,) or (n_elements, n_materials)
- Return type:
cupy.ndarray
Notes
Derivative of SIMP + Heaviside projection
Used in chain rule for sensitivity analysis
Accounts for penalty_schedule if provided
Accounts for beta_schedule if beta is callable
For multi-material: returns gradient per material
- f(rho: cupy.ndarray | None = None)[source]
Compute objective function value (compliance).
- Parameters:
rho (cupy.ndarray, optional) – Design variables for linearization. If None, returns cached value
- Returns:
Compliance value (F^T @ U). Lower is better (stiffer structure).
- Return type:
Notes
If rho provided: returns linearized approximation f(x) + df/dx @ (rho - x)
If rho is None: returns cached value from last set_desvars() call
Compliance = F^T @ U = U^T @ K @ U (strain energy)
- nabla_f(rho: cupy.ndarray | None = None)[source]
Compute objective function gradient (compliance sensitivities).
- Parameters:
rho (cupy.ndarray, optional) – Unused (for interface compatibility)
- Returns:
Gradient of compliance w.r.t. design variables, shape (n_vars,)
- Return type:
cupy.ndarray
Notes
Uses adjoint method: dC/drho = -U^T @ dK/drho @ U
Includes filter adjoint: sens_raw = H^T @ sens_filtered
Negative gradient means increasing density reduces compliance (good)
- g(rho=None)[source]
Compute constraint values (volume + centroid distance violations).
- Parameters:
rho (cupy.ndarray, optional) – Design variables. If None, uses current desvars
- Returns:
Constraint violations, shape (n_constraints,). Negative = satisfied. - First n_materials constraints: volume fraction violations - Last constraint: centroid distance violation (distance^2 - maximum_distance^2)
- Return type:
cupy.ndarray
Notes
Constraint satisfied when g <= 0
For single material: returns [volume_violation, centroid_violation]
For multi-material: returns [vol_viol_1, …, vol_viol_n, centroid_violation]
If rho provided, uses linearized approximation
- nabla_g()[source]
Compute constraint gradients (volume + centroid sensitivities).
- Returns:
Gradient of constraints w.r.t. design variables. - Single material: shape (2, n_vars) - [volume_grad, centroid_grad] - Multi-material: shape (n_materials + 1, n_vars) - [vol_grad_1, …, vol_grad_n, centroid_grad]
- Return type:
cupy.ndarray
Notes
First rows: d(volume_fraction[i])/drho (constant for uniform elements)
Last row: d(centroid_distance^2)/drho (depends on element positions)
Centroid gradient accounts for weighted average of element centroids
Used by optimizers to enforce constraints
- ill_conditioned()[source]
Check if FEA system is ill-conditioned.
- Returns:
True if residual >= 1e-2 (indicates poor solver convergence)
- Return type:
Notes
Residual > 1e-2 suggests numerical issues (check void parameter, penalty)
May indicate near-singular stiffness matrix (too many void elements)
Consider increasing void parameter or reducing penalty
- is_terminal()[source]
Check if penalty continuation has reached final value.
- Returns:
True if penalty schedule is complete or not used
- Return type:
Notes
Used by optimizers to determine if continuation is finished
If penalty_schedule is None, always returns True
If penalty_schedule exists, checks if current iteration has reached final penalty