Purpose¶
The data access system provides a uniform interface for reading and writing field data on meshes and swarms while maintaining compatibility with PETSc’s parallel infrastructure. The design enables direct array manipulation with automatic synchronization, eliminating the need for explicit context managers while preserving solver performance.
Current Implementation (2025+)¶
Direct Array Access¶
Variables now provide direct array access through the array property, which returns an NDArray_With_Callback object that automatically handles PETSc synchronization:
# MeshVariable - direct access
temperature.array[...] = initial_conditions
# SwarmVariable - direct access
material.array[:, 0, 0] = material_indices
# No context manager needed
velocity.array[mask] = boundary_values
The NDArray_With_Callback class wraps NumPy arrays with callbacks that synchronize changes to PETSc’s internal storage. This provides NumPy’s convenience while maintaining PETSc’s parallel correctness.
Array Shapes¶
Variables follow a consistent shape convention:
array property:
(N, a, b)formatScalars:
(N, 1, 1)Vectors:
(N, 1, dim)where dim=2 or 3Tensors:
(N, dim, dim)Symmetric tensors:
(N, dim, dim)(unpacked)
data property:
(-1, num_components)flat formatScalars:
(N, 1)Vectors:
(N, dim)Tensors:
(N, dim²)Symmetric tensors:
(N, 6)for 3D (packed)
The data property provides backward compatibility with code expecting flat arrays.
Batch Operations¶
For operations updating multiple variables simultaneously, use the synchronised_array_update() context manager:
with uw.synchronised_array_update():
temperature.array[...] = T_values
velocity.array[...] = V_values
pressure.array[...] = P_values
# All synchronization happens here atomically
This defers PETSc synchronization until all assignments complete, ensuring atomic updates and better performance for batch operations.
Solver Integration¶
Solvers access data through the vec property, which provides direct PETSc vector access:
# Solver internal code
petsc_vec = variable.vec
petsc_vec.assemble()
This separation preserves solver performance. User code uses array, solver code uses vec, and the system handles synchronization between them.
Architecture Details¶
NDArray_With_Callback¶
The callback mechanism wraps NumPy arrays with synchronization logic:
class NDArray_With_Callback:
def __init__(self, data, callback):
self._data = data
self._callback = callback
def __setitem__(self, key, value):
self._data[key] = value
if self._callback:
self._callback(self._data)
When array data changes, the callback updates the corresponding PETSc vector. This ensures consistency between NumPy-style manipulation and PETSc’s parallel infrastructure.
Lazy Proxy Updates (Swarm Variables)¶
Swarm variables with proxy mesh variables use lazy evaluation to avoid PETSc access conflicts:
# Data write triggers lazy update
swarm_var.array[...] = values # Marks proxy as stale
# Proxy updates only when accessed
symbolic_expr = swarm_var.sym # Triggers RBF interpolation if stale
This pattern prevents circular dependencies when callbacks would trigger nested PETSc field access.
Vector Availability¶
Variables set _available=True by default, ensuring solvers can access vectors without modification. Lazy initialization creates vectors on first access:
@property
def vec(self):
if self._lvec is None:
self._set_vec(available=True)
return self._lvec
This guarantees solver compatibility while maintaining the simple user interface.
Data Property Caching and Self-Validation¶
The .data property on _BaseMeshVariable caches an NDArray_With_Callback object (_canonical_data) that wraps a NumPy view into the PETSc local vector (_lvec.array). This cache avoids recreating the view on every access, which would cause PETSc field access conflicts.
The stale cache problem: The cached view becomes invalid when _lvec is destroyed and recreated. This happens in two situations:
DM rebuild: When a new MeshVariable is added to a mesh that already has fields, PETSc requires a new DM. All existing variables’ vectors are destroyed and recreated from the new DM (see
_setup_ds()indiscretisation_mesh_variables.py).Mesh adaptation: When
mesh.adapt()replaces the mesh, all variables get new vectors on the adapted mesh.
If the cached _canonical_data still references the old (destroyed) _lvec, it reads freed memory and returns zeros — even though the solver correctly wrote results to the new _lvec.
Self-validating cache: The .data property tracks id(self._lvec) when the cache is created. On every access, it verifies the cached view still corresponds to the current _lvec:
@property
def data(self):
cache_valid = (
"_canonical_data" in self.__dict__
and self._canonical_data is not None
and "_canonical_data_lvec_id" in self.__dict__
and self._canonical_data_lvec_id == id(self._lvec)
)
if not cache_valid:
self._canonical_data = self._create_canonical_data_array()
self._canonical_data_lvec_id = id(self._lvec)
return self._canonical_data
This makes the cache self-healing — no code path that replaces _lvec needs to manually invalidate _canonical_data. The DM rebuild loop and mesh.adapt() also eagerly clear _canonical_data as a performance optimization, but correctness does not depend on it.
Warning
Variable Creation After Data Access
Creating new MeshVariables on a mesh triggers a DM rebuild that replaces all existing variables’ PETSc vectors. Code that accesses .data before all variables are created will get a stale cache automatically healed on the next access — but the old NumPy array reference becomes invalid. Always re-read .data after creating new variables.
Performance Considerations¶
Direct Access Overhead¶
The NDArray_With_Callback approach adds minimal overhead. Each array assignment triggers a callback, but this is necessary regardless of API style to maintain PETSc consistency. Benchmarks show negligible performance difference compared to context manager patterns.
Batch Update Benefits¶
The synchronised_array_update() context provides performance benefits by:
Deferring PETSc synchronization until batch completion
Reducing redundant parallel communication
Ensuring atomic updates across multiple variables
Including MPI barriers for proper parallel coordination
Solver Performance Preservation¶
Important
Solver Code Unchanged
Solver internals continue using direct vec property access. No performance impact from data access API changes.
Implementation Files¶
Key implementation locations:
MeshVariable:
discretisation_mesh_variables.pyarrayproperty (lines 700-720)dataproperty for compatibility (lines 680-700)Callback registration in
_create_variable_array()
SwarmVariable:
swarm.pyarrayproperty with caching (lines 857-871)Lazy proxy updates via
_update_proxy_if_stale()(lines 424-445)Flat data property for compatibility (lines 832-854)
NDArray_With_Callback:
utilities/_utils.pyCore wrapper class
Global callback delay mechanism
Data synchronization callbacks
Migration Guide (Legacy Patterns)¶
Access Context Manager (Pre-2025)¶
Previously, data access required explicit context managers:
# OLD: Context manager required
with mesh.access(temperature):
temperature.data[...] = values
with swarm.access(material):
material.data[...] = indices
The context manager handled PETSc vector retrieval and restoration. This pattern is no longer necessary - direct array access is now the standard approach.
Migration Steps¶
To update legacy code:
Remove context managers:
# Before with mesh.access(var): var.data[...] = values # After var.array[...] = values
Update multi-variable operations:
# Before with mesh.access(var1, var2, var3): var1.data[...] = values1 var2.data[...] = values2 var3.data[...] = values3 # After with uw.synchronised_array_update(): var1.array[...] = values1 var2.array[...] = values2 var3.array[...] = values3
Shape adjustments (if using data property):
# data property: flat (N, components) var.data[:, 0] = scalars # array property: shaped (N, a, b) var.array[:, 0, 0] = scalars
Compatibility Layer¶
The legacy mesh.access() method remains available but now acts as a no-op dummy wrapper. It defers callbacks but does not manage PETSc access. This provides compatibility for old code while allowing gradual migration.
The data property continues to work, providing flat array views with automatic synchronization. It exists solely for backward compatibility - new code should use array.
Deprecated Methods¶
The following methods exist only for compatibility with migration test scaffolding and should not be used:
variable.use_legacy_array()- No-op placeholdervariable.use_enhanced_array()- No-op placeholder
These will be removed in a future cleanup phase.