
Conversation


@conraddelgado conraddelgado commented Nov 13, 2025

User description

Description

Please include a summary of the changes and the related issue(s) if they exist.
Please also include relevant motivation and context.

Fixes #(issue) [optional]
  • Overarching goal: investigate the unclosed terms present in the volume-filtered Euler-Lagrange framework
  • Implements explicit computation of the unclosed terms in the volume-filtered momentum equation of the Euler-Lagrange framework for particle-laden flows
  • Explicit immersed-boundary drag force computation
  • General field statistics computation (1st-4th order statistical moments)
  • Periodic spherical immersed boundaries
  • Bulk flow periodic forcing

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Something else

Scope

  • This PR comprises a set of related changes with a common goal

If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.
Provide instructions so we can reproduce.
Please also list any relevant details for your test configuration

  • Test A
  • Test B

Test Configuration:

  • What computers and compilers did you use to test this:
    - macOS
    - Linux (Ubuntu)
    - NCSA Delta NVIDIA A100 GPUs

Checklist

  • I have added comments for the new code
  • I added Doxygen docstrings to the new code
  • I have made corresponding changes to the documentation (docs/)
  • I have added regression tests to the test suite so that people can verify in the future that the feature is behaving as expected
  • I have added example cases in examples/ that demonstrate my new feature performing as expected.
    They run to completion and demonstrate "interesting physics"
  • I ran ./mfc.sh format before committing my code
  • New and existing tests pass locally with my changes, including with GPU capability enabled (both NVIDIA hardware with NVHPC compilers and AMD hardware with CRAY compilers) and disabled
  • This PR does not introduce any repeated code (it follows the DRY principle)
  • I cannot think of a way to condense this code and reduce any introduced additional line count

If your code changes any code source files (anything in src/simulation)

To make sure the code is performing as expected on GPU devices, I have:

  • Checked that the code compiles using NVHPC compilers
  • Checked that the code compiles using CRAY compilers
  • Ran the code on either V100, A100, or H100 GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
  • Ran the code on MI200+ GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
  • Enclosed the new feature via nvtx ranges so that they can be identified in profiles
  • Ran a Nsight Systems profile using ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
  • Ran a Rocprof Systems profile using ./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace, and have attached the output file and plain text results to this PR.
  • Ran my code using various numbers of different GPUs (1, 2, and 8, for example) in parallel and made sure that the results scale similarly to what happens if you run without the new code/feature

PR Type

Enhancement, Tests

Description

  • Implements comprehensive volume filtering framework for particle-laden flows with explicit computation of unclosed terms in the Euler-Lagrange framework

  • Adds FFT-based filtering infrastructure supporting both CPU (FFTW) and GPU (cuFFT) backends with MPI parallelization

  • Computes key quantities including Reynolds stress tensors, effective viscosity, and interphase momentum exchange terms

  • Implements periodic spherical immersed boundaries with on-the-fly levelset computation when storage is disabled

  • Adds periodic bulk flow forcing based on space-time averaged quantities with fluid indicator function weighting

  • Extends MPI communication infrastructure with scalar field send/receive buffers and global domain bounds computation

  • Implements statistical moment computation (1st-4th order) using Welford's online algorithm with GPU parallelization (see the sketch after this list)

  • Adds support for writing volume-filtered conservative variables, pressure, Reynolds stress, effective viscosity, and interphase momentum exchange

  • Includes 5 new test cases demonstrating volume filtering, periodic forcing, and particle drag computation

  • Increases maximum immersed boundary patches from 10 to 1000 to support larger particle-laden systems

  • Adds comprehensive case validation for volume filtering and periodic forcing requirements

  • Includes utility scripts for generating 3D Voronoi-based sphere distributions for uniform particle packing
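
  A minimal single-variable sketch of the Welford-style (Pébay) online update referenced above, for illustration only: the PR implements this in Fortran with GPU parallelization, and the Python class and names below are hypothetical.

  import math

  class OnlineMoments:
      """Running 1st-4th order moments via Welford/Pebay-style updates (illustrative sketch)."""
      def __init__(self):
          self.n = 0
          self.mean = 0.0
          self.m2 = self.m3 = self.m4 = 0.0   # running sums of 2nd/3rd/4th central powers

      def add(self, x):
          n1 = self.n
          self.n += 1
          delta = x - self.mean
          delta_n = delta / self.n
          delta_n2 = delta_n * delta_n
          term1 = delta * delta_n * n1
          self.mean += delta_n
          # Update higher moments first so they use the previous m2/m3 values.
          self.m4 += term1 * delta_n2 * (self.n * self.n - 3 * self.n + 3) \
                     + 6.0 * delta_n2 * self.m2 - 4.0 * delta_n * self.m3
          self.m3 += term1 * delta_n * (self.n - 2) - 3.0 * delta_n * self.m2
          self.m2 += term1

      def moments(self):
          var = self.m2 / self.n
          skew = math.sqrt(self.n) * self.m3 / self.m2**1.5 if self.m2 > 0 else 0.0
          kurt = self.n * self.m4 / self.m2**2 if self.m2 > 0 else 0.0
          return self.mean, var, skew, kurt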

Diagram Walkthrough

flowchart LR
  A["Particle-laden Flow Simulation"] --> B["Volume Filtering Module"]
  A --> C["Periodic Forcing"]
  A --> D["Periodic Immersed Boundaries"]
  B --> E["FFT-based Filtering<br/>CPU/GPU"]
  B --> F["Statistical Moments<br/>1st-4th Order"]
  E --> G["Reynolds Stress<br/>Effective Viscosity"]
  F --> G
  C --> H["Space-time Averaging<br/>with Fluid Indicator"]
  D --> I["On-the-fly Levelset<br/>Computation"]
  G --> J["Filtered Data Output"]
  H --> J
  I --> J
  K["MPI Communication"] --> B
  K --> C
  K --> D

File Walkthrough

Relevant files
Enhancement
26 files
m_volume_filtering.fpp
Volume filtering module for Euler-Lagrange particle-laden flows

src/simulation/m_volume_filtering.fpp

  • Implements a comprehensive volume filtering module for particle-laden
    flows with explicit computation of unclosed terms in the
    Euler-Lagrange framework
  • Provides FFT-based filtering infrastructure supporting both CPU (FFTW)
    and GPU (cuFFT) backends with MPI parallelization
  • Computes key quantities including Reynolds stress tensors, effective
    viscosity, and interphase momentum exchange terms
  • Includes particle force computation and fluid indicator function
    management for immersed boundary methods
+2007/-0
m_data_output.fpp
Add volume-filtered data output capabilities to I/O module

src/simulation/m_data_output.fpp

  • Extends s_write_data_files and s_write_parallel_data_files subroutines
    with optional parameters for filtered quantities
  • Adds support for writing volume-filtered conservative variables,
    pressure, Reynolds stress, effective viscosity, and interphase
    momentum exchange
  • Implements conditional logic to handle filtered data output at initial
    and final time steps when q_filtered_wrt flag is enabled
  • Integrates s_initialize_mpi_data_filtered for MPI-based initialization
    of filtered field data structures
+45/-3   
m_time_steppers.fpp
Add volume filtering module dependencies to time steppers

src/simulation/m_time_steppers.fpp

  • Adds module dependencies for m_volume_filtering and
    m_compute_statistics
  • Enables integration of volume filtering and statistical computation
    capabilities into time-stepping routines
+4/-0     
m_mpi_common.fpp
MPI scalar field communication and global domain bounds   

src/common/m_mpi_common.fpp

  • Added send/receive buffers for scalar field communication with GPU
    memory management
  • Introduced domain_glb array to store global domain bounds across all
    processors
  • Implemented s_mpi_sendrecv_variables_buffers_scalarfield subroutine
    for scalar field MPI communication with pack/unpack operations
  • Added s_mpi_global_domain_bounds subroutine to compute and broadcast
    global domain boundaries
  • Added slab domain decomposition option for z-direction processor
    distribution
+393/-6 
m_start_up.fpp
Volume filtering and statistics computation integration   

src/simulation/m_start_up.fpp

  • Added module imports for volume filtering, statistics computation, and
    additional forcing
  • Conditionally read levelset data only when store_levelset flag is
    enabled
  • Added domain boundary initialization for x, y, z directions
  • Integrated volume filtering momentum equation computation and
    statistics calculation in main time-stepping loop
  • Added periodic forcing computation and particle drag force calculation
  • Implemented filtered variable output and MPI initialization for
    filtered data
  • Added initialization and finalization of volume filtering and
    statistics modules
+170/-81
m_compute_statistics.fpp
Statistical moments computation for volume filtered variables

src/simulation/m_compute_statistics.fpp

  • New module implementing statistical moment computation (1st-4th order)
    for unclosed terms
  • Allocates and manages arrays for Reynolds stress, effective viscosity,
    interphase momentum exchange, and filtered conservative variables
  • Implements Welford's online algorithm for computing statistical
    moments with GPU parallelization
  • Provides initialization and finalization routines for statistics
    module
+340/-0 
m_ibm.fpp
Periodic spherical immersed boundaries and on-the-fly levelset

src/simulation/m_ibm.fpp

  • Added periodic boundary population for immersed boundary markers in
    all three directions
  • Implemented on-the-fly levelset and levelset normal computation for
    spherical boundaries when store_levelset is disabled
  • Added periodic image point projection logic for spheres near domain
    boundaries
  • Modified ghost point detection to use directional subsections instead
    of full 3D subsection
  • Fixed bound checking in image point location search
+186/-18
m_data_input.f90
Post-processing support for filtered variables and statistics

src/post_process/m_data_input.f90

  • Added public declarations for filtered fluid indicator function and
    statistical moment arrays
  • Implemented s_allocate_filtered_arrays subroutine to allocate filtered
    variable arrays
  • Added s_read_parallel_filtered_data subroutine to read volume filtered
    data from restart files
  • Integrated filtered data reading into parallel data input pipeline
  • Added deallocation logic for filtered variable arrays in finalization
    routine
+193/-1 
m_additional_forcing.fpp
Periodic bulk flow forcing for particle-laden flows           

src/simulation/m_additional_forcing.fpp

  • New module implementing periodic bulk flow forcing based on space-time
    averaged quantities
  • Computes spatial averages of density, momentum, and energy with fluid
    indicator function weighting
  • Implements time-windowed averaging for phase-averaged quantities
  • Applies forcing terms to continuity, momentum, and energy equations
  • Includes optional output of forcing statistics to binary file
+204/-0 
m_boundary_common.fpp
Scalar field boundary condition population                             

src/common/m_boundary_common.fpp

  • Added s_populate_scalarfield_buffers subroutine for populating scalar
    field boundary buffers
  • Implemented s_periodic_scalarfield pure subroutine for periodic
    boundary conditions on scalar fields
  • Supports MPI communication and local periodic boundary application for
    scalar fields
+165/-0 
m_ib_patches.fpp
Periodic sphere initialization with domain wrapping           

src/common/m_ib_patches.fpp

  • Modified sphere initialization to support periodic boundary wrapping
    of sphere centroids
  • Added logic to check all periodic projections of sphere center and
    mark grid cells accordingly
  • Changed center storage from 1D to 2D array to accommodate periodic
    projections
  • Pre-computed radius squared for efficiency in distance calculations
+79/-24 
m_compute_levelset.fpp
Periodic levelset computation for spherical boundaries     

src/common/m_compute_levelset.fpp

  • Extended levelset computation to handle periodic boundary conditions
    for spheres
  • Added logic to compute periodic projections of sphere centers near
    domain boundaries
  • Implemented distance calculation to all periodic projections to find
    nearest surface
  • Conditional branching for periodic vs non-periodic sphere levelset
    computation
+91/-22 
m_mpi_proxy.fpp
MPI parameter broadcasting for new features                           

src/simulation/m_mpi_proxy.fpp

  • Added MPI broadcast for new integer parameters: t_step_start_stats,
    mom_f_idx, forcing_window
  • Added MPI broadcast for new logical parameters: periodic_ibs,
    compute_particle_drag, periodic_forcing, forcing_wrt,
    volume_filtering_momentum_eqn, store_levelset,
    slab_domain_decomposition, q_filtered_wrt
  • Added MPI broadcast for new real parameters: u_inf_ref, rho_inf_ref,
    P_inf_ref, filter_width, forcing_dt, fluid_volume_fraction
+7/-3     
m_start_up.fpp
Pre-processing domain bounds initialization                           

src/pre_process/m_start_up.fpp

  • Added call to s_mpi_global_domain_bounds to compute global domain
    bounds during preprocessing
  • Added module variable declarations for periodic immersed boundaries
    and levelset storage flags
+5/-2     
m_global_parameters.fpp
Global parameters for volume filtering and periodic forcing

src/simulation/m_global_parameters.fpp

  • Added 20 new global parameters for volume filtering, periodic forcing,
    and immersed boundary features
  • Initialized new parameters with default values in
    s_assign_default_values_to_user_inputs
  • Extended MPI I/O data allocation to handle 109 additional variables
    for volume filtering
  • Added GPU memory declarations for key filtering and forcing parameters
+50/-0   
m_start_up.fpp
Post-processing output for filtered quantities and statistics

src/post_process/m_start_up.fpp

  • Added filtered quantities output when q_filtered_wrt is enabled
  • Writes filtered fluid indicator function and statistical moments
    (Reynolds stress, effective viscosity, interfacial momentum exchange,
    filtered conserved variables, filtered pressure)
  • Computes 4th-order statistical moments for each quantity
+61/-1   
m_data_output.fpp
Conditional levelset storage in restart files                       

src/pre_process/m_data_output.fpp

  • Made levelset output conditional based on store_levelset flag
  • Fixed MPI I/O data type specifications for levelset and levelset norm
    writing
  • Wrapped levelset file operations in conditional block to skip when
    store_levelset is false
+34/-32 
m_global_parameters.fpp
Post-process global parameters for filtering and IB features

src/post_process/m_global_parameters.fpp

  • Added 4 new logical parameters for periodic immersed boundaries,
    levelset storage, domain decomposition, and filtered output
  • Initialized new parameters with appropriate default values
  • Extended MPI I/O data allocation for filtered quantities when
    q_filtered_wrt is enabled
+33/-9   
m_mpi_proxy.fpp
MPI broadcast for new immersed boundary parameters             

src/pre_process/m_mpi_proxy.fpp

  • Added 3 new logical parameters to MPI broadcast list: periodic_ibs,
    store_levelset, slab_domain_decomposition
  • Reordered parameter list for consistency
+2/-1     
m_mpi_proxy.fpp
MPI broadcast for post-process filtering parameters           

src/post_process/m_mpi_proxy.fpp

  • Added 4 new logical parameters to MPI broadcast: periodic_ibs,
    store_levelset, slab_domain_decomposition, q_filtered_wrt
+2/-1     
m_rhs.fpp
Periodic forcing integration into RHS computation               

src/simulation/m_rhs.fpp

  • Added module import for m_additional_forcing
  • Integrated periodic forcing computation into RHS calculation with NVTX
    profiling ranges
  • Calls s_compute_periodic_forcing when periodic_forcing is enabled
+8/-0     
m_global_parameters.fpp
Pre-process global parameters for periodic IB support       

src/pre_process/m_global_parameters.fpp

  • Added 3 new logical parameters for periodic immersed boundaries and
    domain decomposition
  • Initialized parameters with default values in
    s_assign_default_values_to_user_inputs
+7/-0     
p_main.fpp
Volume filtering initialization in main simulation             

src/simulation/p_main.fpp

  • Added module import for m_volume_filtering
  • Initialized fluid indicator function for volume filtering and periodic
    forcing
  • Initialized filtering kernel and filtered fluid indicator function
    when volume filtering is enabled
  • Initialized fluid indicator gradient computation
+9/-0     
m_constants.fpp
Increased maximum patches limit for IB support                     

src/common/m_constants.fpp

  • Increased num_patches_max from 10 to 1000 to support larger numbers of
    immersed boundaries
+1/-1     
case_validator.py
Case validation for volume filtering and periodic forcing

toolchain/mfc/case_validator.py

  • Increased num_ibs validation limit from 10 to 1000
  • Added validation for periodic immersed boundaries requiring all
    periodic BCs
  • Added check_periodic_forcing method validating required parameters for
    periodic forcing
  • Added check_volume_filtering method validating filter width, domain
    decomposition, and grid requirements
  • Integrated new validation checks into validate_common method
+77/-1   
case_dicts.py
Case dictionary parameter definitions for new features     

toolchain/mfc/run/case_dicts.py

  • Added 3 new logical parameters to COMMON dictionary for periodic IB
    support
  • Increased IB patch range from 10 to 1000 in both PRE_PROCESS and
    SIMULATION
  • Added 15 new parameters to SIMULATION dictionary for volume filtering
    and periodic forcing
  • Added q_filtered_wrt to POST_PROCESS dictionary
+20/-2   
Configuration changes
1 files
delta.mako
GPU memory allocation increase for Delta system                   

toolchain/templates/delta.mako

  • Increased memory allocation from 208G to 240G for GPU-enabled jobs on
    NCSA Delta
+1/-1     
Tests
6 files
case.py
Test case for volume filtered particle-laden flow               

runs/phi01/case.py

  • New test case for particle-laden flow with volume filtering
  • Configures 190 spheres in periodic domain with Mach 0.8 flow
  • Enables periodic forcing, volume filtering, and particle drag
    computation
  • Sets up filtering kernel width and domain decomposition parameters
+171/-0 
case.py
Single sphere volume filtering test case                                 

runs/3d_1sphere_filtering/case.py

  • New test case for single sphere with volume filtering
  • Configures 63³ grid with Mach 1.2 flow at Re=1500
  • Enables periodic forcing, volume filtering, and filtered output
    writing
  • Sets up slab domain decomposition for filtering operations
+154/-0 
case.py
Periodic immersed boundary test case                                         

runs/3d_1sphere_periodic/case.py

  • New test case for periodic immersed boundary with single sphere
  • Configures 63³ grid with Mach 1.2 flow
  • Enables periodic forcing and periodic immersed boundaries
  • Uses slab domain decomposition
+153/-0 
case.py
Off-center periodic IB test configuration                               

runs/3d_periodic_ibs_test/off-center/case.py

  • Test case for off-center periodic immersed boundary
  • Sphere positioned at (15D, 15D, 15D) in periodic domain
  • Enables periodic forcing and periodic immersed boundaries
  • Uses slab domain decomposition
+146/-0 
case.py
Centered periodic IB test configuration                                   

runs/3d_periodic_ibs_test/centered/case.py

  • Test case for centered periodic immersed boundary
  • Sphere positioned at domain center (0, 0, 0)
  • Enables periodic forcing and periodic immersed boundaries
  • Uses slab domain decomposition
+146/-0 
case.py
Drag force computation test case                                                 

runs/3d_drag_test/case.py

  • Test case for explicit drag force computation on single sphere
  • Configures 99³ grid with Mach 1.2 flow at Re=1500
  • Enables particle drag computation with reference flow parameters
  • Uses ghost cell extrapolation boundary conditions
+144/-0 
Miscellaneous
3 files
gen_voronoi_3D.py
3D Voronoi sphere distribution generator                                 

voronoi/gen_voronoi_3D.py

  • New utility script for generating 3D Voronoi-based sphere
    distributions
  • Implements Lloyd relaxation algorithm for uniform sphere packing
  • Generates sphere locations for particle-laden flow test cases
  • Validates non-overlapping sphere placement
+98/-0   
gen_voronoi_2D.py
2D Voronoi circle distribution generator                                 

voronoi/gen_voronoi_2D.py

  • New utility script for generating 2D Voronoi-based circle
    distributions
  • Implements Lloyd relaxation algorithm for uniform circle packing
  • Visualizes initial and relaxed point distributions
+99/-0   
sphere_array_locations.txt
Sphere location data for phi=0.1 test case                             

runs/phi01/sphere_array_locations.txt

  • Data file containing 190 sphere center coordinates for particle-laden
    flow test case
  • Coordinates generated using Voronoi-based Lloyd relaxation for uniform
    distribution
+190/-0 

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for up to 1,000 immersed boundaries (increased from 10).
    • Implemented periodic immersed boundary functionality.
    • Added volume-based filtering for momentum equations with FFT acceleration.
    • Introduced periodic forcing for simulations with configurable parameters.
    • Added particle drag computation capability.
  • Enhancements

    • Extended MPI support for filtered data communication.
    • Improved levelset computation with conditional storage options.
    • Enhanced case validation for new features.
    • Added Voronoi-based point distribution tools.
  • New Examples & Test Cases

    • Added multiple 3D immersed boundary test configurations.



CodeAnt-AI Description

Volume-filtered Euler–Lagrange: explicit unclosed-term computation, periodic IBM, and particle drag output

What Changed

  • Adds a volume-filtering module that computes explicit unclosed terms for the volume-filtered momentum equation (Reynolds stress, effective viscosity, interphase momentum exchange) and produces Favre-filtered conserved/primitive fields
  • Implements FFT-based Gaussian filtering across distributed slabs (with batched MPI transposes) and a normalized filtering kernel used across CPU/GPU builds (see the sketch after this list)
  • Adds fluid indicator, its gradient, and computation of forces on periodic immersed spherical boundaries; per-particle drag forces are accumulated, reduced across ranks, and written to a binary file
  • Exposes new global runtime options (periodic immersed boundaries, compute particle drag, periodic forcing, filter width, volume-filter momentum option) and integrates filtered statistics into MPI I/O views when requested
  • Fixes MPI/GPU buffer handling and several FFT/transpose edge cases to support periodic IBs and batched tensor transfers

Impact

✅ Clearer particle drag outputs
✅ Volume-filtered momentum terms available for analysis
✅ Support for periodic immersed boundaries and periodic forcing

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.


codeant-ai bot commented Jan 21, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Division by zero risk
    Several places divide by filtered_fluid_indicator_function%sf(...) (and by q_cons_filtered(1)%sf(...) for density). If the indicator (or density) is zero or very small where there is no fluid, these divisions will produce NaNs/Infs or propagate invalid values. The code needs explicit guarding or thresholds before performing those divides.

  • Density divide guard
    In the pseudo-turbulent Reynolds stress subtraction you divide by q_cons_filtered(1)%sf(...) (the filtered density). If that value is zero (or nearly zero) in any cell this will produce invalid values. Validate or clamp density before division.

  • Repeated normalization / performance
    The code repeatedly computes real(Nx*Ny*Nz, wp) inside inner loops and does direct divisions by that expression. Recomputing conversion and performing division in inner loops is wasteful and repeated. Precompute inverse normalization and reuse to reduce overhead and improve numerical consistency.

  • Uninitialized data transfer
    The GPU loop copyin uses the whole center array (dimension 3x2) even when periodic_ibs is false and only center(:,1) is initialized. This can lead to copying uninitialized memory to the device and nondeterministic behavior. Either initialize center(:,2) by default or reduce the copyin to only the used slice.

  • GPU private clause
    Several new GPU-parallel loops in s_populate_scalarfield_buffers use the $:GPU_PARALLEL_LOOP(collapse=2) pragma without specifying private variables (other similar loops in the file include private='[l,k]'). Missing private lists can lead to incorrect behavior or race conditions on the device or at least inconsistent code-generation across backends. Verify and align the pragmas with existing patterns.

  • Division by zero
    The energy forcing adds a term that divides by q_cons_vf(1)%sf(i,j,k) (cell density). If a cell density can be zero or extremely small (e.g., inside solids / masked cells), this will produce invalid values or blow up numerically. Add a safe-guard or conditional to avoid division by zero / tiny numbers.

  • Intent mismatch
    The subroutine declares rhs_vf with intent(in) but later updates its contents (adds forcing into rhs_vf). This is an interface/logic mismatch and can lead to incorrect compiler assumptions or unexpected behavior. Verify the intended intent (should likely be intent(inout)) and update the declaration accordingly.

  • File I/O robustness
    The module opens a binary stream on unit 102 at initialization and closes at finalization. Ensure the unit is not already in use, check for successful open, and consider error handling if the file cannot be created (particularly in parallel runs where only rank 0 opens the file).

  • Tetrahedralization correctness
    The routine that builds tetrahedra for each Voronoi cell uses a simple "fan" of consecutive vertex triplets (j, j+1, j+2). For arbitrary 3D Voronoi polytopes this may produce degenerate or incorrect tetrahedra (ordering/convexity assumptions). Validate the approach or use a robust 3D polyhedron tetrahedralization method.

  • Wrong Voronoi input
    Inside the Lloyd iterations, the Voronoi diagram is recomputed using initial_points rather than the current, updated points. This prevents the relaxation from converging correctly because the tessellation doesn't reflect the latest point positions (see the sketch after this list).

  • Division by zero risk
    The centroid normalization divides by np.sum(simplex_areas) without guarding against a zero sum (degenerate or tiny/collinear triangles). This can raise an error or produce NaNs; a safe check or fallback is needed.

  • Timestamp / timestep mismatch
    The code builds the time axis for the MFC induction-time search from sorted(case.get_timestamps()) but builds the corresponding Y/rho arrays by iterating sorted(case.get_timesteps()). If timestamps and timesteps are different sequences (or have different ordering/length), zipping them in find_induction_time may misalign times vs. concentrations and produce incorrect induction times.

  • Time stepping overshoot risk
    generate_ct_saves advances the ReactorNet to ct_time + time_save while looping with ct_time < Tend, then increments ct_time by time_save. If Tend is not an integer multiple of time_save this may advance beyond Tend in the final iteration. Consider clamping the next advance time to Tend to avoid stepping past the intended final time.
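
Regarding the "Wrong Voronoi input" item above, a minimal non-periodic Lloyd-relaxation loop that rebuilds the tessellation from the current points on every iteration might look like the following. This is an illustrative sketch only: it uses scipy rather than the library calls in gen_voronoi_3D.py, approximates cell centroids by vertex averages, ignores periodicity, and the function name is hypothetical.

  import numpy as np
  from scipy.spatial import Voronoi

  def lloyd_relax(initial_points, box_size, n_iter=50):
      points = np.array(initial_points, copy=True)   # do not mutate the caller's array
      for _ in range(n_iter):
          vor = Voronoi(points)                      # rebuild from the CURRENT points
          new_points = points.copy()
          for i, region_idx in enumerate(vor.point_region):
              region = vor.regions[region_idx]
              if len(region) == 0 or -1 in region:   # skip unbounded cells
                  continue
              # Vertex average as a cheap centroid proxy for the illustration.
              new_points[i] = vor.vertices[region].mean(axis=0)
          points = np.clip(new_points, 0.0, box_size)
      return points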


codeant-ai bot commented Jan 21, 2026

CodeAnt AI finished reviewing your PR.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@voronoi/gen_voronoi_3D.py`:
- Around line 28-37: The current loop over points builds tetrahedra by taking
consecutive vertex triples (in the block that constructs tetrahedra then
computes centroids via compute_tetrahedron_centroid and volumes via
compute_tetrahedron_volume), which can produce invalid tetrahedra; replace that
tetrahedralization with a proper Delaunay decomposition of the vertex set (use
scipy.spatial.Delaunay on vertices[i] to get tetrahedra indices and then compute
centroids/volumes from those tetrahedra), or simpler and preferred, use freud's
precomputed values by replacing the per-cell centroid/volume computation with
voro_data.centroid[i] and voro_data.volumes[i] to get correct weighted centroids
and volumes. Ensure you remove the consecutive-triple logic and adapt downstream
code to consume the Delaunay-based tetrahedra or voro_data fields.
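
A minimal sketch of the Delaunay-based option described above, for illustration only: it assumes the Voronoi cell is convex and that its vertex coordinates are available as an (N, 3) array, and it is not a drop-in patch for gen_voronoi_3D.py.

  import numpy as np
  from scipy.spatial import Delaunay

  def cell_centroid_and_volume(vertices):
      # Tetrahedralize the convex cell via the Delaunay triangulation of its vertices.
      tets = Delaunay(vertices).simplices           # (n_tet, 4) index array
      centroid = np.zeros(3)
      volume = 0.0
      for tet in tets:
          a, b, c, d = vertices[tet]
          v = abs(np.dot(b - a, np.cross(c - a, d - a))) / 6.0   # tetrahedron volume
          centroid += v * (a + b + c + d) / 4.0                  # volume-weighted centroid
          volume += v
      return centroid / volume, volume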
♻️ Duplicate comments (2)
voronoi/gen_voronoi_3D.py (2)

20-22: Avoid mutating the caller’s input array.

points = initial_points aliases the input, so the relaxation mutates initial_points and the “initial points” plot becomes incorrect. Copy the array before iterating.

🛠️ Suggested fix
-    points = initial_points
+    points = np.array(initial_points, copy=True)

61-63: Create the output directory recursively.

os.mkdir fails if ../runs doesn’t exist. Use os.makedirs(..., exist_ok=True) instead.

🛠️ Suggested fix
-    if os.path.exists(output_dir) == False:
-        os.mkdir(output_dir)
+    os.makedirs(output_dir, exist_ok=True)
🧹 Nitpick comments (2)
examples/3D_random_sphere_array/case.py (2)

1-3: Unused import: math

The math module is imported but never used. All mathematical operations use numpy instead.

Suggested fix
 import json
-import math
 import numpy as np

36-38: Consider adding basic validation for the sphere location file.

If sphere_array_locations.txt is missing or has fewer than 3 columns, the script will fail with an unclear error when accessing sphere_loc[i, 0:3] in the loop below. A brief check could improve usability:

Suggested validation
 # load initial sphere locations
-sphere_loc = np.loadtxt("sphere_array_locations.txt")
+sphere_loc = np.loadtxt("sphere_array_locations.txt", ndmin=2)
+if sphere_loc.ndim != 2 or sphere_loc.shape[1] < 3:
+    raise ValueError("sphere_array_locations.txt must contain at least 3 columns (x, y, z)")
 N_sphere = len(sphere_loc)


codeant-ai bot commented Jan 22, 2026

CodeAnt AI is running Incremental review



@codeant-ai codeant-ai bot added and then removed the size:XXL label (This PR changes 1000+ lines, ignoring generated files) on Jan 22, 2026

codeant-ai bot commented Jan 23, 2026

CodeAnt AI Incremental review completed.


@cubic-dev-ai cubic-dev-ai bot left a comment


3 issues found across 19 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/post_process/m_global_parameters.fpp">

<violation number="1" location="src/post_process/m_global_parameters.fpp:866">
P1: volume_filter_size is only computed for model_eqns == 2, but it is used for MPI allocations whenever q_filtered_wrt is true. If q_filtered_wrt is enabled with a different model, this uses an uninitialized size and can trigger invalid allocations.</violation>
</file>

<file name="src/simulation/m_global_parameters.fpp">

<violation number="1" location="src/simulation/m_global_parameters.fpp:1214">
P1: volume_filter_size is only computed for model_eqns == 2, but it is used for allocations whenever q_filtered_wrt is enabled. For other model_eqns values this leaves volume_filter_size uninitialized and can cause invalid allocation sizes at runtime.</violation>
</file>

<file name="src/simulation/m_additional_forcing.fpp">

<violation number="1" location="src/simulation/m_additional_forcing.fpp:168">
P2: Use total density for the energy forcing denominator. Dividing by only the first continuity field breaks multi-fluid cases and can cause divide-by-zero when that phase is absent.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/simulation/m_start_up.fpp (1)

1279-1352: Guard filtered-stat I/O to avoid uninitialized buffers.

q_filtered_wrt can be enabled without volume_filter_momentum_eqn, yet the save path updates/writes stat_* arrays regardless. That can dereference uninitialized/unallocated buffers or write garbage when t_step < t_step_start_stats. Gate the filtered I/O on volume_filter_momentum_eqn (and the stats start step).

🐛 Proposed fix
-        if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
+        if (q_filtered_wrt .and. volume_filter_momentum_eqn .and. &
+            t_step >= t_step_start_stats .and. (t_step == 0 .or. t_step == t_step_stop)) then
             $:GPU_UPDATE(host='[filtered_fluid_indicator_function%sf]')
             do i = 1, num_dims
                 do j = 1, num_dims
                     do k = 1, 4
                         $:GPU_UPDATE(host='[stat_reynolds_stress(i, j)%vf(k)%sf]')
                         $:GPU_UPDATE(host='[stat_eff_visc(i, j)%vf(k)%sf]')
                     end do
                 end do
             end do
             do i = 1, num_dims
                 do j = 1, 4
                     $:GPU_UPDATE(host='[stat_int_mom_exch(i)%vf(j)%sf]')
                 end do
             end do
             do i = 1, E_idx
                 do j = 1, 4
                     $:GPU_UPDATE(host='[stat_q_cons_filtered(i)%vf(j)%sf]')
                 end do
             end do
             do i = 1, 4
                 $:GPU_UPDATE(host='[stat_filtered_pressure(i)%sf]')
             end do
         end if
@@
-        else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
+        else if (q_filtered_wrt .and. volume_filter_momentum_eqn .and. &
+                 t_step >= t_step_start_stats .and. (t_step == 0 .or. t_step == t_step_stop)) then
             call s_write_data_files(q_cons_ts(1)%vf, q_T_sf, q_prim_vf, save_count, bc_type, &
                                     filtered_fluid_indicator_function=filtered_fluid_indicator_function, &
                                     stat_q_cons_filtered=stat_q_cons_filtered, stat_filtered_pressure=stat_filtered_pressure, &
                                     stat_reynolds_stress=stat_reynolds_stress, stat_eff_visc=stat_eff_visc, stat_int_mom_exch=stat_int_mom_exch)
🤖 Fix all issues with AI agents
In `@src/common/m_mpi_common.fpp`:
- Around line 346-389: The code uses hardcoded index offsets when assigning
MPI_IO_DATA%var (e.g., sys_size + 1, +37, +73, +85, +105) which is fragile and
error-prone; replace those magic numbers by computing named offsets from
sys_size (or define constants) based on the sizes of the preceding blocks
(referencing filtered_fluid_indicator_function%sf, stat_reynolds_stress,
stat_eff_visc, stat_int_mom_exch, stat_q_cons_filtered, stat_filtered_pressure)
so the indexing is explicit and maintainable, and fix the view-creation loop to
start at sys_size + 1 (change do i = sys_size, alt_sys to do i = sys_size + 1,
alt_sys) to avoid overwriting the last base variable's view for
MPI_IO_DATA%view.

In `@src/post_process/m_data_input.f90`:
- Around line 243-301: The long Allocate statements in subroutine
s_allocate_filtered_arrays (e.g., allocations for
filtered_fluid_indicator_function%sf, stat_q_cons_filtered%vf(j)%sf,
stat_reynolds_stress%vf(k)%sf, stat_eff_visc%vf(k)%sf,
stat_int_mom_exch%vf(j)%sf) exceed line-length limits; break each long
allocate(...) line into multiple lines using Fortran continuation (&) after
commas or before the continued token, indent continuations for readability, and
ensure parentheses and array bounds (local_start_idx:end_x,
local_start_idx:end_y, local_start_idx:end_z) remain intact on the continued
lines so the allocate calls compile cleanly and satisfy Fortitude.

In `@src/simulation/m_volume_filtering.fpp`:
- Around line 214-216: The abort message incorrectly references "p" while the
check uses mod(Ny, num_procs); update the s_mpi_abort call in this conditional
(the call to s_mpi_abort inside the block that checks mod(Ny, num_procs) /= 0)
so the message refers to Ny (or "y-dimension" / "number of y points") being
divisible by num_procs rather than "p", e.g. change the error string to mention
Ny/rows or y-dimension divisibility by num_procs.

In `@toolchain/mfc/case_validator.py`:
- Around line 1104-1192: Strip all trailing whitespace characters from the
affected block (inside the methods check_periodic_forcing and
check_volume_filtering) so no lines end with extra spaces; edit the lines in
those functions' bodies and docstrings to remove trailing spaces and re-run lint
to confirm the Pylint trailing-whitespace warnings are resolved.
♻️ Duplicate comments (9)
src/simulation/m_data_output.fpp (3)

1018-1054: Avoid writing the beta slot twice in the filtered MPI loop.
When q_filtered_wrt is true and beta is present, the loop writes sys_size+1 and then the “Correction” block writes it again, risking overlap/corruption.

🐛 Suggested fix
            else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
                do i = 1, alt_sys
+                    if (present(beta) .and. i == sys_size + 1) cycle
                     var_MOK = int(i, MPI_OFFSET_KIND)
                     disp = m_MOK*max(MOK, n_MOK)*max(MOK, p_MOK)*WP_MOK*(var_MOK - 1)
                     call MPI_FILE_SET_VIEW(ifile, disp, mpi_p, MPI_IO_DATA%view(i), &
                                            'native', mpi_info_int, ierr)
                     call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf, data_size*mpi_io_type, &
                                             mpi_io_p, status, ierr)
                end do

117-123: Serial path drops filtered outputs.
When parallel_io is false, the filtered/statistics arguments are ignored, so q_filtered_wrt cannot produce outputs in serial runs. Please extend the serial path or explicitly reject this configuration.

💡 Suggested fix (also update s_write_serial_data_files signature)
-        if (.not. parallel_io) then
-            call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
-        else
+        if (.not. parallel_io) then
+            if (q_filtered_wrt .and. present(filtered_fluid_indicator_function)) then
+                call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta, &
+                                               filtered_fluid_indicator_function, &
+                                               stat_q_cons_filtered, stat_filtered_pressure, &
+                                               stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+            else
+                call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
+            end if
+        else
             call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
                                              filtered_fluid_indicator_function, &
                                              stat_q_cons_filtered, stat_filtered_pressure, &
                                              stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
         end if

957-963: Filtered MPI buffers can remain uninitialized.
s_initialize_mpi_data_filtered is only called under ib, so q_filtered_wrt with ib = .false. (or file-per-process) leaves filtered views/buffers uninitialized. Move the call outside the ib branch or assert unsupported combinations.

src/simulation/m_volume_filtering.fpp (3)

585-594: Finite-difference stencil can read out of bounds.
The loop runs i=0..m, j=0..n, k=0..p while indexing i+l with l=-4..4, which can access beyond bounds near edges. Restrict loop bounds or use buffered indices.

🐛 Suggested fix (bounds-safe loop)
-        $:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
-        do i = 0, m
-            do j = 0, n
-                do k = 0, p
+        $:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
+        do i = 4, m-4
+            do j = 4, n-4
+                do k = 4, p-4
                     $:GPU_LOOP(parallelism='[seq]')
                     do l = -4, 4
                         grad_fluid_indicator(1)%sf(i, j, k) = grad_fluid_indicator(1)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i + l, j, k)/dx(i)
                         grad_fluid_indicator(2)%sf(i, j, k) = grad_fluid_indicator(2)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j + l, k)/dy(j)
                         grad_fluid_indicator(3)%sf(i, j, k) = grad_fluid_indicator(3)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j, k + l)/dz(k)
                     end do
                 end do
             end do
         end do

785-786: Guard against zero filtered density.
q_cons_filtered(1)%sf can be zero inside particles, causing division-by-zero/NaNs in Reynolds stress.

🐛 Suggested fix
+        real(wp), parameter :: eps_rho = 1.0e-12_wp
...
-                            reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
-                                                                - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k)/q_cons_filtered(1)%sf(i, j, k))
+                            reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
+                                                                - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k) / &
+                                                                   max(q_cons_filtered(1)%sf(i, j, k), eps_rho))

1553-1560: Filter back-transform divides by near-zero fluid indicator.
Several assignments divide by filtered_fluid_indicator_function%sf, which can be tiny near/inside particles. Add an epsilon floor (or conditional zeroing) for stability.

🐛 Suggested fix (apply to all similar divisions)
+        real(wp), parameter :: eps_phi = 1.0e-12_wp
+        real(wp) :: denom
...
-                        q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm/filtered_fluid_indicator_function%sf(i, j, k)
+                        denom = max(filtered_fluid_indicator_function%sf(i, j, k), eps_phi)
+                        q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm/denom

Also applies to: 1602-1607, 1652-1656, 1704-1708

src/simulation/p_main.fpp (1)

59-64: Ensure volume-filtering resources are finalized.
Initialization is added here, but the matching finalization calls are not visible. Please confirm these are invoked in s_finalize_modules() when volume_filter_momentum_eqn is enabled.

#!/bin/bash
# Verify volume-filtering finalizers are called in shutdown paths.
rg -n "s_finalize_.*filter|s_finalize_fftw_explicit_filter_module" src/simulation
rg -n "volume_filter_momentum_eqn" src/simulation/m_start_up.fpp src/simulation/p_main.fpp
src/common/m_ib_patches.fpp (1)

617-633: Avoid shared module state inside the GPU loop (thread-safety).

s_convert_cylindrical_to_cartesian_coord writes module-level cart_y/cart_z inside a parallel region. This is the same race previously flagged; please make the conversion thread‑local (e.g., using f_convert_cyl_to_cart and local temps) instead of module variables.

src/common/m_mpi_common.fpp (1)

1604-1613: Slab decomposition validation relies on generic error message.

The validation for slab decomposition divisibility is present via the ierr flag check at lines 1731-1734, but the error message is generic ("Unsupported combination of values..."). Consider adding a more specific error message for the slab decomposition case to help users diagnose the issue. This is a duplicate of a past review suggestion.

🧹 Nitpick comments (6)
src/simulation/m_data_output.fpp (2)

85-115: Reduce argument count in s_write_data_files.
This signature now exceeds the ≤6-argument guideline; consider a derived-type params struct for the filtered/statistics payload. As per coding guidelines, ...


801-814: Reduce argument count in s_write_parallel_data_files.
This signature exceeds the ≤6-argument guideline; bundle the optional filtered/statistics fields into a derived-type params struct. As per coding guidelines, ...

src/simulation/m_volume_filtering.fpp (1)

721-729: Trim parameter lists for filtering routines.
s_setup_terms_filtering and s_filter_batch exceed the ≤6-argument guideline; wrap these inputs into a derived-type params object for maintainability. As per coding guidelines, ...

Also applies to: 1037-1046

src/post_process/m_data_input.f90 (1)

664-669: Reduce argument count via a params struct.

s_read_parallel_filtered_data takes 8 arguments, exceeding the project guideline (≤6). Consider a small derived type (e.g., type(mpi_io_params)) to pass the MPI offsets as a bundle.

As per coding guidelines, keep routine arguments ≤ 6 by grouping related parameters.

toolchain/mfc/run/case_dicts.py (1)

118-143: Large parameter dictionary expansion for IB patches.

Expanding ib_id range from 10 to 1000 generates thousands of dictionary entries. While this aligns with the PR objective to support up to 1000 IB patches, consider the impact on:

  1. Schema validation performance (fastjsonschema compilation)
  2. Memory footprint of the parameter dictionaries

This may be acceptable for the use case, but worth noting for future optimization if startup time becomes an issue.

src/simulation/m_global_parameters.fpp (1)

1048-1050: Consider documenting the volume_filter_size formula.

The formula sys_size + 1 + 4*(2*num_dims**2 + num_dims + E_idx + 1) accounts for multiple filtered quantities with 4 statistical moments each, but the calculation is not self-documenting. Consider adding a comment explaining the breakdown:

  • sys_size: base conservative variables
  • +1: filtered fluid indicator function
  • 4 * (...): 4 statistical moments for each filtered quantity
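
As a quick sanity check of that formula (assuming a 3D case where E_idx = 5, i.e. a single-fluid setup; these values are hypothetical), the extra-variable count matches the 109 additional I/O variables quoted in the m_global_parameters.fpp walkthrough entry:

  num_dims, E_idx = 3, 5                       # assumed 3D single-fluid values
  extra = 1 + 4 * (2 * num_dims**2 + num_dims + E_idx + 1)
  print(extra)                                 # 109: indicator + 4 moments per filtered quantity
  # volume_filter_size = sys_size + extra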


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/simulation/m_data_output.fpp (1)

85-116: Reduce argument count via a params struct.

This subroutine now exceeds the ≤6 argument guideline. Consider bundling the filtered/stat fields into a derived-type params struct (or module-level context) to keep the API stable and easier to evolve. As per coding guidelines, routine arguments should be ≤6.

🤖 Fix all issues with AI agents
In `@src/post_process/m_data_input.f90`:
- Around line 251-299: Several allocate(...) statements in
s_allocate_filtered_arrays (e.g. filtered_fluid_indicator_function%sf(...),
stat_q_cons_filtered(i)%vf(j)%sf(...), stat_reynolds_stress(i,j)%vf(k)%sf(...),
stat_eff_visc(i,j)%vf(k)%sf(...), stat_int_mom_exch(i)%vf(j)%sf(...)) exceed 100
chars and must be wrapped using Fortran continuation; split the long extent list
across lines with an ampersand at the end of the previous line and a leading
ampersand on the continued line so each allocate call stays under 100 chars,
preserving the full allocate(...) expression and indices exactly and keeping the
existing loop structure and indices intact.

In `@src/simulation/m_compute_statistics.fpp`:
- Around line 161-169: The subroutine
s_compute_statistics_momentum_unclosed_terms currently exceeds the ≤6-argument
guideline; create a derived type (e.g., type :: momentum_unclosed_params) that
bundles the unclosed-term fields (reynolds_stress, eff_visc, int_mom_exch,
q_cons_filtered, filtered_pressure) and replace those five arguments in
s_compute_statistics_momentum_unclosed_terms with a single intent(in) parameter
of that new type, keeping only the remaining scalar arguments (t_step,
t_step_start_stats) as separate arguments; update all call sites to construct
and pass the new params struct and adjust any internal references to use the
struct members.

In `@src/simulation/m_data_output.fpp`:
- Around line 801-814: The subroutine s_write_parallel_data_files currently
exceeds the ≤6-argument guideline; refactor by introducing a single derived type
(e.g. a params/holder type) to carry the optional filtered/stat fields
(stat_q_cons_filtered, stat_filtered_pressure, stat_reynolds_stress,
stat_eff_visc, stat_int_mom_exch and optionally
filtered_fluid_indicator_function and beta) and replace those separate optional
arguments with one optional instance of that derived type in the
s_write_parallel_data_files signature; update the type definition, adjust the
intent and optional attributes accordingly, and update all call sites to
construct/pass the new params object so the routine arguments are reduced to ≤6
while preserving intent(inout) semantics for the contained fields.

In `@src/simulation/m_global_parameters.fpp`:
- Around line 1048-1050: The computation of volume_filter_size is done too early
(when q_filtered_wrt is true) before additional contributions are added to
sys_size (surface tension, damage, chemistry), which can under-size MPI buffers;
move or recompute the volume_filter_size assignment so it runs after sys_size is
fully finalized (i.e., right before the MPI_IO_DATA allocation), keeping the
same conditional on q_filtered_wrt and using the final sys_size value; update
any duplicate early computations (also present around the other occurrence) to
the same post-finalization location to ensure buffers are sized correctly.

In `@src/simulation/m_start_up.fpp`:
- Around line 1152-1157: Before passing 1._wp/fluid_pp(1)%Re(1) into
s_volume_filter_momentum_eqn, validate that fluid_pp(1)%Re(1) is set and
strictly positive; if it is missing or <= 0, call s_mpi_abort with a clear error
message and stop execution. Update the block guarded by
volume_filter_momentum_eqn (the call to s_volume_filter_momentum_eqn) to perform
this check on fluid_pp(1)%Re(1) and only compute the reciprocal
1._wp/fluid_pp(1)%Re(1) when the value is valid; reference the symbols
fluid_pp(1)%Re(1), s_volume_filter_momentum_eqn and s_mpi_abort when making the
change.

In `@src/simulation/m_volume_filtering.fpp`:
- Around line 214-216: The abort message is misleading: the check compares Ny
against num_procs but the s_mpi_abort text refers to "p"; update the error text
to reference Ny (or "Ny" and "number of ranks") so it matches the condition.
Locate the conditional that uses Ny and num_procs in the routine (the if
statement and the call to s_mpi_abort) and change the message passed to
s_mpi_abort to something like "Volume filtering requires Ny to be divisible by
the number of ranks" (or similar phrasing referencing Ny) so logs accurately
reflect the checked dimension.

In `@toolchain/mfc/case_validator.py`:
- Around line 1104-1192: Trim all trailing whitespace characters from the newly
added validator methods check_periodic_forcing and check_volume_filtering (and
any adjacent lines in that diff); locate occurrences inside those functions
(including lines with calls to self.get(...), self.prohibit(...), and the
docstrings) and remove trailing spaces at line ends so the linter warning is
resolved.
♻️ Duplicate comments (18)
src/simulation/p_main.fpp (1)

25-64: Add finalization for volume-filtering resources.

The new initializers are invoked under volume_filter_momentum_eqn, but there’s no corresponding cleanup in the finalize path. Please add finalizers (or confirm s_finalize_modules already handles them) to avoid leaking large allocations.

src/common/m_mpi_common.fpp (1)

341-389: Fix alt_sys so filtered MPI views cover the full variable range.

alt_sys is set to volume_filter_size, so the view-creation loop can miss appended variables. It should be sys_size + volume_filter_size and the loop should start at sys_size + 1.

🐛 Proposed fix
-        alt_sys = volume_filter_size
+        alt_sys = sys_size + volume_filter_size
...
-        do i = sys_size, alt_sys
+        do i = sys_size + 1, alt_sys
src/post_process/m_data_input.f90 (1)

679-709: Fix alt_sys so MPI filtered reads include base + filtered variables.

alt_sys is set to volume_filter_size, so NVARS_MOK and the read loop undercount when base variables are present. It should be sys_size + volume_filter_size.

🐛 Proposed fix
-        alt_sys = volume_filter_size ! filtered indicator, stats of: R_u, R_mu, F_imet, q_cons_filtered, pressure
+        alt_sys = sys_size + volume_filter_size ! filtered indicator + stats appended to base vars
src/simulation/m_ibm.fpp (3)

620-630: Fix off-by-one in image-point search bounds.
index+1 is accessed, so allowing index==m+buff_size can read past the coordinate arrays.

🐛 Proposed fix
-                    bound = m + buff_size
+                    bound = m + buff_size - 1
...
-                    bound = n + buff_size
+                    bound = n + buff_size - 1
...
-                    bound = p + buff_size
+                    bound = p + buff_size - 1

648-651: Use s_mpi_abort instead of error stop.
error stop can hang MPI ranks; switch to the MPI-aware abort. As per coding guidelines, use s_mpi_abort for error termination.

🐛 Suggested fix
-                            print *, "Increase buff_size further in m_helper_basic (currently set to a minimum of 10)"
-                            error stop "Increase buff_size"
+                            call s_mpi_abort("Increase buff_size further in m_helper_basic (currently set to a minimum of 10)")

1316-1320: Deallocate levelset fields even when store_levelset=F.
They are allocated unconditionally in initialization, so conditional deallocation leaks memory. As per coding guidelines, use the deallocation macro.

🐛 Proposed fix
         @:DEALLOCATE(ib_markers%sf)
-        if (store_levelset) then
-            @:DEALLOCATE(levelset%sf)
-            @:DEALLOCATE(levelset_norm%sf)
-        end if
+        @:DEALLOCATE(levelset%sf)
+        @:DEALLOCATE(levelset_norm%sf)
src/post_process/m_global_parameters.fpp (2)

123-124: Ensure volume_filter_size is initialized for all q_filtered_wrt cases.
It’s computed only under model_eqns == 2, but used for allocation whenever q_filtered_wrt is true; if q_filtered_wrt is enabled for another model, the size is undefined. Consider computing it after sys_size is finalized for all models, or validate that q_filtered_wrt is only used with model_eqns=2.

♻️ Suggested placement (after sys_size is finalized)
-            if (q_filtered_wrt) then
-                volume_filter_size = sys_size + 1 + 4*(2*num_dims**2 + num_dims + E_idx + 1)
-            end if
...
+        if (q_filtered_wrt) then
+            volume_filter_size = sys_size + 1 + 4*(2*num_dims**2 + num_dims + E_idx + 1)
+        end if

Also applies to: 698-700, 865-868


1058-1062: Deallocate extra MPI_IO_DATA entries for filtered outputs.
Nulling pointers alone leaks the allocated sf buffers. As per coding guidelines, deallocate with the macro when associated.

🐛 Proposed fix
             if (q_filtered_wrt) then
                 do i = sys_size + 1, volume_filter_size
-                    MPI_IO_DATA%var(i)%sf => null()
+                    if (associated(MPI_IO_DATA%var(i)%sf)) then
+                        @:DEALLOCATE(MPI_IO_DATA%var(i)%sf)
+                    end if
                 end do
             end if
src/simulation/m_data_output.fpp (3)

117-123: Serial path drops filtered outputs when parallel_io is false.

q_filtered_wrt data never reaches s_write_serial_data_files, so filtered outputs are silently skipped in serial runs. This matches a prior finding.

🔧 Suggested fix
-        if (.not. parallel_io) then
-            call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
+        if (.not. parallel_io) then
+            if (q_filtered_wrt .and. present(filtered_fluid_indicator_function)) then
+                call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta, &
+                                               filtered_fluid_indicator_function, stat_q_cons_filtered, stat_filtered_pressure, &
+                                               stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+            else
+                call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
+            end if
         else
             call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
                                              filtered_fluid_indicator_function, &
                                              stat_q_cons_filtered, stat_filtered_pressure, &
                                              stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
         end if

957-963: Filtered MPI buffers are still only initialized inside the ib branch.

When q_filtered_wrt is true but ib is false (or when file_per_process is true), the filtered MPI buffers/views remain uninitialized. This mirrors a prior finding.
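A sketch of the intended decoupling, assuming the filtered setup depends only on q_filtered_wrt and the output step (routine and argument names are taken from elsewhere in this review, not re-verified against the source):

    ! Hedged sketch: run the filtered-buffer setup whenever filtered output is
    ! requested, outside of the ib / file_per_process branches.
    if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
        call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
                                            stat_q_cons_filtered, stat_filtered_pressure, &
                                            stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
    end if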


1018-1054: Potential duplicate write of beta slot when q_filtered_wrt is enabled.

The q_filtered_wrt loop writes indices 1..alt_sys, which includes sys_size+1 when present(beta) sets alt_sys = sys_size + 1, and then beta is written again in the correction block. This is a previously reported overlap risk.

src/simulation/m_additional_forcing.fpp (2)

111-115: Momentum-energy term still assumes 3 components.

momxb, momxb+1, momxb+2 hard-code a 3D layout and break 1D/2D or nonstandard layouts. This matches a prior review note.
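One dimension-agnostic form of the term, sketched with the names used in this review (ke and d are illustrative locals that would also need declarations and a spot on the GPU loop's private list):

    ! Hedged sketch: sum the squared momentum over num_dims components instead
    ! of hard-coding momxb, momxb+1, momxb+2.
    ke = 0._wp
    do d = 0, num_dims - 1
        ke = ke + q_cons_vf(momxb + d)%sf(i, j, k)**2
    end do
    ke = 0.5_wp*ke/rho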


166-168: Energy forcing denominator is single-fluid and unguarded.

The denominator q_cons_vf(contxb) can be zero, and using it ignores the multi-fluid mixture density, consistent with earlier review findings.

src/simulation/m_global_parameters.fpp (1)

1048-1050: volume_filter_size only set in model_eqns==2.

If q_filtered_wrt is enabled under other equation sets, volume_filter_size can be uninitialized and lead to invalid allocations. This repeats a prior review finding.
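A sketch of the hoisting, reusing the size formula already quoted for the post-process module; the only assumption is that it runs after sys_size is finalized for every model:

    ! Hedged sketch: compute the filtered-output size once sys_size is final,
    ! independent of model_eqns, so q_filtered_wrt never reads an undefined value.
    if (q_filtered_wrt) then
        volume_filter_size = sys_size + 1 + 4*(2*num_dims**2 + num_dims + E_idx + 1)
    end if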

src/simulation/m_compute_statistics.fpp (1)

239-246: Moments divide by zero when ns <= 3 or variance is zero.

The formulas divide by ns-1, ns-2, and ns-3 (at least one of which is zero when ns <= 3) and by Msn(2), which is zero when the variance is zero. This matches a prior review finding.
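The guard could look like the following sketch, written with plain sample-moment definitions rather than the file's exact estimators (Msn(k) is treated here as a running sum of centered powers, per the review's naming; eps_var, skew, and kurt are illustrative locals):

    ! Hedged sketch: only form higher moments once enough samples exist and the
    ! variance is bounded away from zero.
    if (ns > 3 .and. Msn(2) > eps_var) then
        skew = (Msn(3)/real(ns, wp))/(Msn(2)/real(ns, wp))**1.5_wp
        kurt = (Msn(4)/real(ns, wp))/(Msn(2)/real(ns, wp))**2
    else
        skew = 0._wp
        kurt = 0._wp
    end if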

src/simulation/m_volume_filtering.fpp (3)

585-594: Stencil access can go out of bounds at domain edges.
i+l, j+l, k+l will underflow/overflow near boundaries unless halos are ≥4. Restrict loop ranges or guard indices.


785-786: Guard division by filtered density.
q_cons_filtered(1)%sf(i,j,k) can be ~0 in solid regions, causing NaNs. Add a small floor.

🐛 Suggested pattern
-                            reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
-                                                                - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k)/q_cons_filtered(1)%sf(i, j, k))
+                            reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
+                                                                - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k) &
+                                                                   / max(q_cons_filtered(1)%sf(i, j, k), eps_rho))

1553-1557: Protect divisions by filtered_fluid_indicator_function.
The denominator can be 0 in solid cells, leading to NaNs for q_cons_filtered, filtered_pressure, reynolds_stress, and eff_visc. Add an epsilon or conditional guard.

🐛 Suggested pattern (apply to all four sites)
-                        q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm/filtered_fluid_indicator_function%sf(i, j, k)
+                        q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm / &
+                            max(filtered_fluid_indicator_function%sf(i, j, k), eps_phi)

Also applies to: 1602-1606, 1652-1656, 1704-1708

🧹 Nitpick comments (4)
toolchain/mfc/case_validator.py (1)

1104-1155: Reduce local variable count in check_periodic_forcing to satisfy pylint.
A small table-driven validation can drop several locals while keeping the same error messages.

♻️ Possible refactor sketch
-        u_inf_ref = self.get('u_inf_ref')
-        rho_inf_ref = self.get('rho_inf_ref')
-        P_inf_ref = self.get('P_inf_ref')
-        forcing_window = self.get('forcing_window')
-        forcing_dt = self.get('forcing_dt')
+        required = {
+            'u_inf_ref': "u_inf_ref (desired bulk velocity) must be specified for periodic_forcing",
+            'rho_inf_ref': "rho_inf_ref (desired bulk density) must be specified for periodic_forcing",
+            'P_inf_ref': "P_inf_ref (desired pressure based on bulk internal energy) must be specified for periodic_forcing",
+            'forcing_window': "forcing_window must be specified for periodic_forcing",
+            'forcing_dt': "forcing_dt must be specified for periodic_forcing",
+        }
+        for key, msg in required.items():
+            self.prohibit(self.get(key) is None, msg)
src/simulation/m_ibm.fpp (1)

154-228: Add explicit private lists to periodic-buffer GPU loops.
If the macro doesn’t implicitly privatize j/k/l, these loops can race. Please verify and add private='[j,k,l]' (apply to all six loops). As per coding guidelines, declare loop-local variables in GPU parallel macros.

♻️ Example (apply to all six loops)
-                $:GPU_PARALLEL_LOOP(collapse=3)
+                $:GPU_PARALLEL_LOOP(collapse=3, private='[j,k,l]')
src/simulation/m_volume_filtering.fpp (2)

721-729: Consider grouping arguments to keep subroutines ≤6 params.
Both s_setup_terms_filtering and s_filter_batch exceed 6 arguments. A derived-type params struct would keep signatures manageable. As per coding guidelines, consider consolidating related fields.

Also applies to: 1037-1045


92-95: Verify mixed‑precision slab buffers are intentional.
Slab buffers and MPI traffic use complex(c_float_complex)/MPI_COMPLEX while FFTs operate on double‑complex. This introduces implicit casts and potential precision loss. If double precision is required end‑to‑end, align buffer kinds and MPI datatypes.

🔧 Possible alignment (if double precision is desired)
-    complex(c_float_complex), allocatable :: data_cmplx_slabz(:, :, :), data_cmplx_slaby(:, :, :)
+    complex(c_double_complex), allocatable :: data_cmplx_slabz(:, :, :), data_cmplx_slaby(:, :, :)

-        call MPI_Alltoall(sendbuf_sf, NxC*Nyloc*Nzloc, MPI_COMPLEX, &
-                          recvbuf_sf, NxC*Nyloc*Nzloc, MPI_COMPLEX, MPI_COMM_WORLD, ierr)
+        call MPI_Alltoall(sendbuf_sf, NxC*Nyloc*Nzloc, MPI_DOUBLE_COMPLEX, &
+                          recvbuf_sf, NxC*Nyloc*Nzloc, MPI_DOUBLE_COMPLEX, MPI_COMM_WORLD, ierr)

Also applies to: 895-897, 933-935, 973-975, 1015-1017

Comment on lines +161 to +169
subroutine s_compute_statistics_momentum_unclosed_terms(t_step, t_step_start_stats, reynolds_stress, eff_visc, int_mom_exch, q_cons_filtered, filtered_pressure)
type(scalar_field), dimension(num_dims, num_dims), intent(in) :: reynolds_stress
type(scalar_field), dimension(num_dims, num_dims), intent(in) :: eff_visc
type(scalar_field), dimension(num_dims), intent(in) :: int_mom_exch
type(scalar_field), dimension(sys_size), intent(in) :: q_cons_filtered
type(scalar_field), intent(in) :: filtered_pressure
integer, intent(in) :: t_step
integer, intent(in) :: t_step_start_stats


🛠️ Refactor suggestion | 🟠 Major

Reduce argument count via a params struct.

This routine now exceeds the ≤6 argument guideline; consider packing the unclosed-term fields into a derived-type params struct. As per coding guidelines, routine arguments should be ≤6.

🤖 Prompt for AI Agents
In `@src/simulation/m_compute_statistics.fpp` around lines 161 - 169, The
subroutine s_compute_statistics_momentum_unclosed_terms currently exceeds the
≤6-argument guideline; create a derived type (e.g., type ::
momentum_unclosed_params) that bundles the unclosed-term fields
(reynolds_stress, eff_visc, int_mom_exch, q_cons_filtered, filtered_pressure)
and replace those five arguments in s_compute_statistics_momentum_unclosed_terms
with a single intent(in) parameter of that new type, keeping only the remaining
scalar arguments (t_step, t_step_start_stats) as separate arguments; update all
call sites to construct and pass the new params struct and adjust any internal
references to use the struct members.
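A sketch of the bundling described above (the type name comes from the prompt; allocatable components stand in for the fixed-extent dummy arrays):

    ! Hedged sketch of the proposed params struct.
    type :: momentum_unclosed_params
        type(scalar_field), dimension(:, :), allocatable :: reynolds_stress
        type(scalar_field), dimension(:, :), allocatable :: eff_visc
        type(scalar_field), dimension(:), allocatable :: int_mom_exch
        type(scalar_field), dimension(:), allocatable :: q_cons_filtered
        type(scalar_field) :: filtered_pressure
    end type momentum_unclosed_params

Call sites would then shrink to something like call s_compute_statistics_momentum_unclosed_terms(t_step, t_step_start_stats, unclosed), with unclosed being a variable of the new type.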

@codeant-ai

codeant-ai bot commented Jan 23, 2026

CodeAnt AI is running Incremental review



@codeant-ai codeant-ai bot added size:XXL This PR changes 1000+ lines, ignoring generated files and removed size:XXL This PR changes 1000+ lines, ignoring generated files labels Jan 23, 2026
@codeant-ai

codeant-ai bot commented Jan 23, 2026

CodeAnt AI Incremental review completed.


@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 16 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/simulation/m_ibm.fpp">

<violation number="1" location="src/simulation/m_ibm.fpp:70">
P1: Conditional allocation of levelset/levelset_norm leaves later unconditional uses on unallocated arrays when store_levelset is false, which will crash or corrupt memory in 3D runs.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

-buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs))
@:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
-buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs, 1:3))
if (store_levelset) then

@cubic-dev-ai cubic-dev-ai bot Jan 23, 2026


P1: Conditional allocation of levelset/levelset_norm leaves later unconditional uses on unallocated arrays when store_levelset is false, which will crash or corrupt memory in 3D runs.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/simulation/m_ibm.fpp, line 70:

<comment>Conditional allocation of levelset/levelset_norm leaves later unconditional uses on unallocated arrays when store_levelset is false, which will crash or corrupt memory in 3D runs.</comment>

<file context>
@@ -67,10 +67,12 @@ contains
-                -buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs))
-            @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
-                -buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs, 1:3))
+            if (store_levelset) then
+                @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
+                    -buff_size:n+buff_size, -buff_size:p+buff_size, 1:num_ibs))
</file context>


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/simulation/m_ibm.fpp (1)

76-83: 2D case allocates levelset unconditionally, inconsistent with 3D and deallocation.

When p == 0 (2D), lines 79-82 allocate levelset%sf and levelset_norm%sf unconditionally, but lines 1321-1324 only deallocate them when store_levelset is true. This creates a memory leak in 2D when store_levelset = .false..

🐛 Suggested fix
         else
             @:ALLOCATE(ib_markers%sf(-buff_size:m+buff_size, &
                 -buff_size:n+buff_size, 0:0))
-            @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
-                -buff_size:n+buff_size, 0:0, 1:num_ibs))
-            @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
-                -buff_size:n+buff_size, 0:0, 1:num_ibs, 1:3))
+            if (store_levelset) then
+                @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
+                    -buff_size:n+buff_size, 0:0, 1:num_ibs))
+                @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
+                    -buff_size:n+buff_size, 0:0, 1:num_ibs, 1:3))
+            end if
         end if
🤖 Fix all issues with AI agents
In `@examples/3D_random_sphere_array/case.py`:
- Around line 36-38: The code assumes sphere_loc is 2-D but np.loadtxt can
return a 1-D array for single-row files; change the loading to ensure a 2-D
array and compute count from rows by wrapping the result with np.atleast_2d
(e.g., sphere_loc = np.atleast_2d(np.loadtxt(...))) and replace len(sphere_loc)
with sphere_loc.shape[0] so references like sphere_loc[i, 0] work safely.
- Around line 164-166: case_dict may contain NumPy scalar values from
computations (e.g., np.sqrt or array indexing) which json.dumps cannot
serialize; update the serialization to convert NumPy types to native Python
types by passing a default handler or by casting values before dumping. For
example, after case_dict.update(ib_dict) call json.dumps(case_dict,
default=lambda o: o.item() if hasattr(o, "item") else o) or pre-walk
case_dict/ib_dict to replace NumPy scalars with float/int via .item(); reference
the variables case_dict, ib_dict and the json.dumps call to locate where to
apply this change.

In `@src/common/m_ib_patches.fpp`:
- Around line 87-89: The initialization currently calls levelset routines
unconditionally for several geometry types and must be protected by the same
store_levelset guard used for spheres: wrap each call that computes levelset and
levelset_norm (e.g., calls like s_cuboid_levelset, s_cylinder_levelset,
s_airfoil3d_levelset and the 2D variants such as s_circle_levelset,
s_rectangle_levelset, s_airfoil2d_levelset, s_ellipse_levelset) in if
(store_levelset) then ... end if, so levelset/levelset_norm are only referenced
when they have been allocated; apply this change to every geometry branch in
m_ib_patches.fpp where levelset routines are invoked.

In `@src/post_process/m_data_input.f90`:
- Around line 535-537: Several statements in src/post_process/m_data_input.f90
(notably the conditional invoking s_read_parallel_filtered_data with
q_filtered_wrt, t_step and t_step_stop) exceed 100 characters; wrap those long
Fortran source lines using the continuation character (&) so they satisfy
Fortitude S001. Locate the long conditional and the long call to
s_read_parallel_filtered_data (and the other long lines around the ranges
mentioned: the block from ~671–703 and the line near ~762) and split argument
lists and logical expressions onto subsequent lines with trailing & on the
broken line and leading &/aligned indentation on the continuation lines,
preserving the original order of tokens and parentheses. Ensure no token is
split across lines and keep code readable by grouping related arguments (e.g.,
t_step checks on one line, the call arguments on subsequent lines) while keeping
each resulting line under 100 chars (see the wrapping sketch after this list).

In `@src/pre_process/m_initial_condition.fpp`:
- Around line 102-107: The code allocates levelset and levelset_norm only when
store_levelset is true but later calls s_apply_ib_patches and other geometry
routines that unconditionally pass levelset/levelset_norm, causing potential
dereference of unallocated arrays; update callers (e.g., the s_apply_ib_patches
invocation that currently passes ib_markers%sf, levelset, levelset_norm) to
conditionally pass those optional arguments or use the alternate call when
store_levelset is false, and also wrap any computations that write/read
levelset/levelset_norm in an if (store_levelset) guard so non-spherical IBs
never touch unallocated levelset or levelset_norm storage.

In `@src/simulation/m_additional_forcing.fpp`:
- Around line 111-115: The kinetic energy term currently hardcodes three
momentum components (momxb, momxb+1, momxb+2); change it to compute ke by
looping over offsets 0..(num_dims-1) and summing
q_cons_vf(momxb+offset)%sf(i,j,k)**2, then divide by rho and use 0.5*ke in the
spatial_eps update (keep E_idx, dVol and rho usage). Declare and initialize a
local ke variable inside the GPU loop, add it to the loop's private clause, and
replace the explicit three-component expression with the loop-based ke
calculation so the code works for 1D/2D/3D.
- Around line 48-52: The calculation for domain_vol unconditionally uses
domain_glb(3,*) which causes out-of-bounds in 2D; update the block around
domain_vol to compute volume based on num_dims (use 3D product when num_dims==3,
2D product when num_dims==2, and 1D when num_dims==1) before computing
avg_coeff, ensuring domain_glb indices are only accessed for existing dimensions
and preserving the subsequent avg_coeff = 1._wp/(domain_vol*(1._wp -
volfrac_phi)).
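As referenced in the S001 item above, a syntax-only sketch of the wrapping (the call and its argument list are assumptions pieced together from this review, not the file's verified signature):

    ! Hedged sketch: a trailing '&' continues the statement on the next line,
    ! keeping each physical line under 100 characters.
    if (q_filtered_wrt .and. &
        (t_step == 0 .or. t_step == t_step_stop)) then
        call s_read_parallel_filtered_data(filtered_fluid_indicator_function, &
                                           stat_q_cons_filtered, stat_filtered_pressure, &
                                           stat_reynolds_stress, stat_eff_visc, &
                                           stat_int_mom_exch)
    end if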
♻️ Duplicate comments (15)
src/pre_process/m_data_output.fpp (1)

842-865: Fix MPI datatype/count for levelset and levelset_norm writes.
Using mpi_p and raw element counts bypasses mixed-precision sizing; use mpi_io_p and scale by mpi_io_type.

🐛 Proposed fix
-                call MPI_FILE_SET_VIEW(ifile, disp, mpi_p, MPI_IO_levelset_DATA%view, &
+                call MPI_FILE_SET_VIEW(ifile, disp, mpi_io_p, MPI_IO_levelset_DATA%view, &
                                        'native', mpi_info_int, ierr)
-                call MPI_FILE_WRITE_ALL(ifile, MPI_IO_levelset_DATA%var%sf, data_size*num_ibs, &
-                                        mpi_p, status, ierr)
+                call MPI_FILE_WRITE_ALL(ifile, MPI_IO_levelset_DATA%var%sf, data_size*num_ibs*mpi_io_type, &
+                                        mpi_io_p, status, ierr)
-                call MPI_FILE_SET_VIEW(ifile, disp, mpi_p, MPI_IO_levelsetnorm_DATA%view, &
+                call MPI_FILE_SET_VIEW(ifile, disp, mpi_io_p, MPI_IO_levelsetnorm_DATA%view, &
                                        'native', mpi_info_int, ierr)
-                call MPI_FILE_WRITE_ALL(ifile, MPI_IO_levelsetnorm_DATA%var%sf, data_size*num_ibs*3, &
-                                        mpi_p, status, ierr)
+                call MPI_FILE_WRITE_ALL(ifile, MPI_IO_levelsetnorm_DATA%var%sf, data_size*num_ibs*3*mpi_io_type, &
+                                        mpi_io_p, status, ierr)
src/post_process/m_global_parameters.fpp (1)

1066-1070: Deallocate q_filtered_wrt MPI_IO buffers before nulling.
The extra stats slots are allocated but only nullified here, which leaks memory. Based on learnings, please use the device-aware deallocation macro.

🐛 Proposed fix
             if (q_filtered_wrt) then
                 do i = sys_size + 1, sys_size + volume_filter_dt%stat_size
-                    MPI_IO_DATA%var(i)%sf => null()
+                    if (associated(MPI_IO_DATA%var(i)%sf)) then
+                        @:DEALLOCATE(MPI_IO_DATA%var(i)%sf)
+                    end if
+                    MPI_IO_DATA%var(i)%sf => null()
                 end do
             end if
toolchain/mfc/case_validator.py (2)

1111-1124: Remove trailing whitespace in the new validators (lint failure).
Pylint is failing on trailing whitespace in these added blocks; trimming resolves the pipeline error.

Also applies to: 1162-1200


1194-1195: Fix the even-dimension parity check.
The current (dim + 1) % 2 test rejects valid even sizes; check the dimension directly.

🐛 Proposed fix
-            self.prohibit((m_glb + 1) % 2 != 0 or (n_glb + 1) % 2 != 0 or (p_glb + 1) % 2 != 0,
+            self.prohibit(m_glb % 2 != 0 or n_glb % 2 != 0 or p_glb % 2 != 0,
                          "volume_filter_momentum_eqn requires global dimensions divisible by 2 in every direction")
src/common/m_mpi_common.fpp (1)

386-391: Avoid overwriting base MPI views when initializing filtered views.
The loop starts at sys_size, which can reinitialize the last base variable’s view. Start at sys_size + 1 instead.

🐛 Proposed fix
-        do i = sys_size, alt_sys
+        do i = sys_size + 1, alt_sys
             call MPI_TYPE_CREATE_SUBARRAY(num_dims, sizes_glb, sizes_loc, start_idx, &
                                           MPI_ORDER_FORTRAN, mpi_p, MPI_IO_DATA%view(i), ierr)
             call MPI_TYPE_COMMIT(MPI_IO_DATA%view(i), ierr)
         end do
src/common/m_ib_patches.fpp (1)

619-665: Avoid shared-state coordinate conversion inside the GPU loop.
s_convert_cylindrical_to_cartesian_coord writes module-level cart_y/cart_z, which is unsafe in parallel loops. Use the pure function with thread-local temporaries instead.

🐛 Suggested fix (thread‑local conversion)
-                        if (grid_geometry == 3) then
-                            call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))
-                        else
-                            cart_y = y_cc(j)
-                            cart_z = z_cc(k)
-                        end if
+                        if (grid_geometry == 3) then
+                            tmp_cyl = (/0._wp, y_cc(j), z_cc(k)/)
+                            tmp_cart = f_convert_cyl_to_cart(tmp_cyl)
+                            cart_y = tmp_cart(2)
+                            cart_z = tmp_cart(3)
+                        else
+                            cart_y = y_cc(j)
+                            cart_z = z_cc(k)
+                        end if
src/simulation/m_data_output.fpp (2)

85-123: Serial I/O path does not forward filtered data arguments.

When parallel_io is false (Line 117-118), the call to s_write_serial_data_files does not pass the new filtered arguments (filtered_fluid_indicator_function, stat_q_cons_filtered, etc.). If q_filtered_wrt is enabled with serial I/O, the filtered data will not be written.

Either extend s_write_serial_data_files to accept and handle these arguments, or add a runtime check/abort when q_filtered_wrt is true and parallel_io is false.
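A sketch of the second option, an early abort, using s_mpi_abort as elsewhere in this review:

    ! Hedged sketch: fail loudly rather than silently dropping filtered output
    ! until the serial writer accepts the new arguments.
    if (q_filtered_wrt .and. .not. parallel_io) then
        call s_mpi_abort('q_filtered_wrt currently requires parallel_io = T')
    end if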


801-814: Argument count exceeds the ≤6 guideline.

s_write_parallel_data_files now accepts 10 arguments. As per coding guidelines, consider introducing a derived-type params struct to bundle the filtered/stat fields and reduce the signature complexity.

src/simulation/m_ibm.fpp (2)

625-634: Off-by-one risk: loop accesses s_cc(index + 1) but bound allows index == bound.

The bound calculation at lines 627, 630, 633 sets bound = m/n/p + buff_size, but the while-loop at line 648-656 accesses s_cc(index + 1). When index == bound, s_cc(index + 1) reads past the allocated array bounds.

🐛 Suggested fix
                 if (dim == 1) then
                     s_cc => x_cc
-                    bound = m + buff_size
+                    bound = m + buff_size - 1
                 elseif (dim == 2) then
                     s_cc => y_cc
-                    bound = n + buff_size
+                    bound = n + buff_size - 1
                 else
                     s_cc => z_cc
-                    bound = p + buff_size
+                    bound = p + buff_size - 1
                 end if

651-655: Replace error stop with s_mpi_abort for MPI safety.

Using error stop can leave other MPI ranks hanging. As per coding guidelines, use s_mpi_abort for error termination.

🐛 Suggested fix
                         if (index < -buff_size .or. index > bound) then
-                            print *, "proc_rank=", proc_rank, "temp_loc=", temp_loc, " index=", index, "ib=", patch_id, "dim", dim, "dir", dir, "i, j, k", i, j, k
-                            print *, "Increase buff_size further in m_helper_basic (currently set to a minimum of 10)"
-                            error stop "Increase buff_size"
+                            call s_mpi_abort("Image point search exceeded buffer bounds. Increase buff_size (minimum 10).")
                         end if
src/simulation/m_additional_forcing.fpp (1)

170-172: Division by zero risk when local density is zero.

Line 172 divides by rho, which is computed by summing fluid densities. At phase interfaces or in regions where all fluids are absent, rho could be zero or near-zero, causing division by zero or numerical instability.

🐛 Suggested fix
                     ! f_E
-                    q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt &
-                                                      + q_cons_vf(contxe + mom_f_idx)%sf(i, j, k)*q_periodic_force(2)%sf(i, j, k)/rho
+                    if (rho > 1.e-12_wp) then
+                        q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt &
+                                                          + q_cons_vf(contxe + mom_f_idx)%sf(i, j, k)*q_periodic_force(2)%sf(i, j, k)/rho
+                    else
+                        q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt
+                    end if
src/simulation/m_volume_filtering.fpp (4)

214-216: Misleading error message: references 'p' but checks 'Ny'.

The check validates mod(Ny, num_procs) but the error message incorrectly states the requirement is for p.

✏️ Suggested fix
         if (mod(Ny, num_procs) /= 0) then
-            call s_mpi_abort('Volume filtering requires p to be divisible by the number of ranks')
+            call s_mpi_abort('Volume filtering requires Ny to be divisible by the number of ranks')
         end if

585-598: Critical: Out-of-bounds array access in finite-difference stencil.

The loop iterates i = 0, m, j = 0, n, k = 0, p, but the stencil accesses i + l where l ∈ [-4, 4]. At boundary cells (e.g., i = 0), this accesses fluid_indicator_function%sf(-4, j, k), causing out-of-bounds memory access unless idwbuff provides sufficient ghost cells.

Verify that idwbuff bounds provide at least 4 ghost cells in each direction, or restrict the loop ranges:

🐛 Option 1: Restrict loop ranges (safer)
         $:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
-        do i = 0, m
-            do j = 0, n
-                do k = 0, p
+        do i = 4, m - 4
+            do j = 4, n - 4
+                do k = 4, p - 4
                     $:GPU_LOOP(parallelism='[seq]')
                     do l = -4, 4
🔍 Option 2: Verify buffer allocation
#!/bin/bash
# Check idwbuff allocation bounds to verify sufficient ghost cells
rg -n 'idwbuff' --type-add 'fpp:*.fpp' --type=fpp -C3 | head -60

785-787: Potential division by zero on filtered density.

When q_cons_filtered(1)%sf(i, j, k) (filtered density) approaches zero inside solid regions, this division produces NaN/Inf values.

🐛 Proposed fix with epsilon floor
+                            real(wp), parameter :: eps_rho = 1.0e-12_wp
                             reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
-                                                                - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k)/q_cons_filtered(1)%sf(i, j, k))
+                                                                - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k) &
+                                                                   / max(q_cons_filtered(1)%sf(i, j, k), eps_rho))

1552-1560: Multiple divisions by filtered_fluid_indicator_function risk NaN in solid regions.

Lines 1556, 1605, 1655, and 1707 all divide by filtered_fluid_indicator_function%sf(i, j, k), which is zero inside particles. This systematically produces NaN/Inf values throughout the filtered fields.

Apply a consistent epsilon guard across all four locations:

🐛 Proposed pattern for all divisions

Define a module-level constant and apply uniformly:

real(wp), parameter :: eps_indicator = 1.0e-12_wp

Then replace each division:

-q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm/filtered_fluid_indicator_function%sf(i, j, k)
+q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)*fft_norm &
+    / max(filtered_fluid_indicator_function%sf(i, j, k), eps_indicator)

Apply the same pattern at lines 1605, 1655, and 1707.

🧹 Nitpick comments (8)
src/pre_process/m_global_parameters.fpp (1)

949-954: Use @:ALLOCATE for the new levelset MPI buffers.
These allocations are in a .fpp module and should use the device-aware macro. As per coding guidelines, please prefer @:ALLOCATE here.

♻️ Suggested change
-            allocate (MPI_IO_levelset_DATA%var%sf(0:m, 0:n, 0:p, 1:num_ibs))
-            allocate (MPI_IO_levelsetnorm_DATA%var%sf(0:m, 0:n, 0:p, 1:num_ibs, 1:3))
+            @:ALLOCATE(MPI_IO_levelset_DATA%var%sf(0:m, 0:n, 0:p, 1:num_ibs), &
+                       MPI_IO_levelsetnorm_DATA%var%sf(0:m, 0:n, 0:p, 1:num_ibs, 1:3))
src/post_process/m_global_parameters.fpp (1)

872-878: Use @:ALLOCATE for q_filtered_wrt MPI_IO_DATA buffers.
These new allocations are in a .fpp module; using the device-aware macro keeps GPU builds consistent. As per coding guidelines, please prefer @:ALLOCATE here.

♻️ Suggested change
-                allocate (MPI_IO_DATA%var(i)%sf(0:m, 0:n, 0:p))
+                @:ALLOCATE(MPI_IO_DATA%var(i)%sf(0:m, 0:n, 0:p))
src/post_process/m_data_input.f90 (1)

671-680: Consider bundling MPI I/O params into a derived type.
This signature has 8 parameters; grouping the MPI offsets into a small derived type would align with the ≤6-argument guideline and simplify call sites. As per coding guidelines, ...

src/simulation/m_ibm.fpp (2)

713-723: Remove or enable commented-out code.

Lines 714-723 contain commented-out alternative logic for 1D subsection checks. If this code is no longer needed, remove it to reduce clutter. If it's intended for future use, consider adding a TODO comment explaining its purpose.


805-811: Remove or enable commented-out code.

Lines 806-811 contain the same commented-out 1D subsection logic pattern. This duplicates the commented code in s_find_num_ghost_points. Either remove both or document the intent.

src/simulation/m_volume_filtering.fpp (3)

339-343: Consider using newunit= for automatic unit assignment.

Hardcoded unit number 100 may conflict with other file operations in the codebase. Using newunit= provides safer automatic assignment.

♻️ Suggested improvement
+        integer, save :: particle_force_unit
         ! file for particle forces
         if (compute_particle_drag) then
             if (proc_rank == 0) then
-                open (unit=100, file='particle_force.bin', status='replace', form='unformatted', access='stream', action='write')
+                open (newunit=particle_force_unit, file='particle_force.bin', status='replace', form='unformatted', access='stream', action='write')
             end if
         end if

Then update the write and close statements to use particle_force_unit instead of 100.


1037-1765: Subroutine s_filter_batch exceeds size guidelines (~730 lines).

Per coding guidelines, subroutines should be ≤ 500 lines. This subroutine contains highly repetitive FFT transform patterns for different field types (conservative vars, pressure, Reynolds stress, effective viscosity, interphase momentum exchange).

Consider extracting reusable helper routines:

  1. s_fft_forward_xy: Handles X FFT → reorder → Y FFT → store to batch slab
  2. s_fft_z_convolve: Handles Z FFT → convolution → inverse Z FFT
  3. s_fft_backward_yx: Handles Y inverse FFT → reorder → X inverse FFT → store result

This would reduce s_filter_batch to ~100-150 lines of orchestration logic and improve maintainability.
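A non-compilable skeleton of the split (helper names come from the list above; their argument lists are placeholders that would need to carry the slab buffers and FFT plans):

    ! Hedged skeleton: s_filter_batch reduced to per-field orchestration.
    subroutine s_filter_one_field(field_in, field_out)
        type(scalar_field), intent(in) :: field_in
        type(scalar_field), intent(inout) :: field_out

        call s_fft_forward_xy(field_in)    ! X FFT -> reorder -> Y FFT -> batch slab
        call s_fft_z_convolve()            ! Z FFT -> kernel convolution -> inverse Z FFT
        call s_fft_backward_yx(field_out)  ! inverse Y FFT -> reorder -> inverse X FFT
    end subroutine s_filter_one_field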


3-2015: File exceeds 1000-line guideline (~2015 lines).

The coding guidelines recommend files ≤ 1000 lines. Consider splitting into focused submodules:

  • m_volume_filtering: Public API, orchestration, core filtering workflow
  • m_volume_filtering_fft: FFT plans, forward/backward transforms, slab transposes
  • m_volume_filtering_terms: Stress tensor, Reynolds stress, effective viscosity computations

@codecov

codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 30.97964% with 1085 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.00%. Comparing base (944aa2f) to head (8841877).

Files with missing lines Patch % Lines
src/simulation/m_volume_filtering.fpp 27.25% 513 Missing and 2 partials ⚠️
src/post_process/m_data_input.f90 0.00% 93 Missing and 3 partials ⚠️
src/common/m_mpi_common.fpp 28.80% 88 Missing and 1 partial ⚠️
src/simulation/m_compute_statistics.fpp 27.38% 61 Missing ⚠️
src/simulation/m_start_up.fpp 34.21% 42 Missing and 8 partials ⚠️
src/simulation/m_ibm.fpp 45.34% 36 Missing and 11 partials ⚠️
src/common/m_compute_levelset.fpp 13.46% 42 Missing and 3 partials ⚠️
src/simulation/m_additional_forcing.fpp 53.08% 38 Missing ⚠️
src/post_process/m_start_up.fpp 0.00% 35 Missing and 1 partial ⚠️
src/common/m_boundary_common.fpp 49.23% 26 Missing and 7 partials ⚠️
... and 10 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1043      +/-   ##
==========================================
- Coverage   44.16%   43.00%   -1.17%     
==========================================
  Files          71       74       +3     
  Lines       20417    21923    +1506     
  Branches     1991     2067      +76     
==========================================
+ Hits         9018     9428     +410     
- Misses      10252    11308    +1056     
- Partials     1147     1187      +40     


@codeant-ai

codeant-ai bot commented Jan 26, 2026

CodeAnt AI is running Incremental review



@codeant-ai codeant-ai bot added size:XXL This PR changes 1000+ lines, ignoring generated files and removed size:XXL This PR changes 1000+ lines, ignoring generated files labels Jan 26, 2026
do j = 0, n
do i = 0, m
if (grid_geometry == 3) then
call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))

Suggestion: The call to s_convert_cylindrical_to_cartesian_coord writes module-level variables cart_y/cart_z; inside the GPU-parallel loop those names are declared private — the subroutine still updates the shared module variables, causing data races and incorrect values. Compute cartesian coordinates into the local/private variables directly inside the loop instead of calling the subroutine. [race condition]

Severity Level: Critical 🚨
- ❌ Sphere IB placement incorrect for periodic cylindrical grids.
- ⚠️ s_apply_ib_patches -> s_ib_sphere affected.
- ⚠️ Non-deterministic ib_markers_sf assignments observed.
Suggested change
-call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))
+cart_y = y_cc(j)*sin(z_cc(k))
+cart_z = y_cc(j)*cos(z_cc(k))
Steps of Reproduction ✅
1. Ensure there is a spherical IB patch: call path s_apply_ib_patches -> s_ib_sphere. In
src/common/m_ib_patches.fpp the call to s_ib_sphere is invoked from s_apply_ib_patches
(see the loop that calls `call s_ib_sphere(i, ib_markers_sf)` around lines added earlier
in the PR, e.g. ~line 86 in the new-hunk).

2. Enable periodic immersed boundaries and use cylindrical grid geometry: set
`periodic_ibs = .true.` and `grid_geometry == 3`. This causes s_ib_sphere to take the
periodic branch which contains the GPU loop starting at the $:GPU_PARALLEL_LOOP line shown
at src/common/m_ib_patches.fpp:622.

3. Enter the GPU-parallel loop (src/common/m_ib_patches.fpp:623-631). The loop privatizes
`cart_y`/`cart_z` locally, but calls `s_convert_cylindrical_to_cartesian_coord(y_cc(j),
z_cc(k))` which writes the module-level variables `cart_y` and `cart_z` (the subroutine is
defined in the same file: subroutine s_convert_cylindrical_to_cartesian_coord in
src/common/m_ib_patches.fpp).

4. Observe non-deterministic or incorrect ib_markers_sf writes (wrong cells marked)
because multiple threads execute the subroutine which updates the shared module variables,
producing a data race. Reproduced by running a parallel GPU/accelerator build with a
spherical IB in a cylindrical, periodic domain.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/common/m_ib_patches.fpp
**Line:** 627:627
**Comment:**
	*Race Condition: The call to `s_convert_cylindrical_to_cartesian_coord` writes module-level variables `cart_y`/`cart_z`; inside the GPU-parallel loop those names are declared private — the subroutine still updates the shared module variables, causing data races and incorrect values. Compute cartesian coordinates into the local/private variables directly inside the loop instead of calling the subroutine.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

Comment on lines +120 to +123
call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
filtered_fluid_indicator_function, &
stat_q_cons_filtered, stat_filtered_pressure, &
stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)

Suggestion: The call to s_write_parallel_data_files unconditionally passes several optional dummy arguments (e.g. filtered_fluid_indicator_function, stat_q_cons_filtered, stat_filtered_pressure, stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch) from s_write_data_files even when they might not have been provided to s_write_data_files. In Fortran you must not pass an optional actual argument that is not present; this can lead to undefined behaviour or runtime/compile problems. Fix by calling the parallel writer with the additional optional arguments only when at least one of them is present; otherwise call the routine with only the mandatory arguments. [possible bug]

Severity Level: Critical 🚨
- ❌ Parallel restart output may crash at s_write_parallel_data_files.
- ❌ MPI-file writes risk memory corruption.
- ⚠️ Filtered-statistics output unreliable or omitted.
- ⚠️ Affects any run using parallel_io and missing optionals.
Suggested change
-call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
-                                 filtered_fluid_indicator_function, &
-                                 stat_q_cons_filtered, stat_filtered_pressure, &
-                                 stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+if (present(filtered_fluid_indicator_function) .or. present(stat_q_cons_filtered) .or. &
+    present(stat_filtered_pressure) .or. present(stat_reynolds_stress) .or. &
+    present(stat_eff_visc) .or. present(stat_int_mom_exch)) then
+    call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
+                                     filtered_fluid_indicator_function, &
+                                     stat_q_cons_filtered, stat_filtered_pressure, &
+                                     stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+else
+    call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta)
+end if
Steps of Reproduction ✅
1. Enter subroutine s_write_data_files in module m_data_output (src/simulation/m_data_output.fpp; the subroutine header and body span the PR hunk lines ~82-124). The conditional at line 117 chooses the parallel branch when parallel_io is true.

2. With parallel_io true the code executes the else-branch and reaches the call at lines 120-123, which unconditionally passes filtered_fluid_indicator_function and the other filtered-statistics variables to s_write_parallel_data_files.

3. If the caller invoked s_write_data_files without supplying those optional actual arguments (they are absent in s_write_data_files), then the variables are absent optionals inside s_write_data_files; passing an absent optional actual to another procedure is illegal in Fortran. The callee s_write_parallel_data_files is defined in this file (see its header around lines ~798-820).

4. Expected outcome: depending on compiler/ABI this can produce a compile-time diagnostic or undefined runtime behaviour (crash, memory corruption) when executing the call at src/simulation/m_data_output.fpp:120-123. The correct fix is to only forward optionals when present (or provide a two-call conditional as in the improved_code).
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/simulation/m_data_output.fpp
**Line:** 120:123
**Comment:**
	*Possible Bug: The call to `s_write_parallel_data_files` unconditionally passes several optional dummy arguments (e.g. `filtered_fluid_indicator_function`, `stat_q_cons_filtered`, `stat_filtered_pressure`, `stat_reynolds_stress`, `stat_eff_visc`, `stat_int_mom_exch`) from `s_write_data_files` even when they might not have been provided to `s_write_data_files`. In Fortran you must not pass an optional actual argument that is not present; this can lead to undefined behaviour or runtime/compile problems. Fix by calling the parallel writer with the additional optional arguments only when at least one of them is present; otherwise call the routine with only the mandatory arguments.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

Comment on lines 848 to 853
if (present(beta)) then
alt_sys = sys_size + 1
else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
alt_sys = sys_size + volume_filter_dt%stat_size
else
alt_sys = sys_size

Suggestion: The computation of alt_sys uses an if/else-if chain so only one extra contribution is added: either +1 for beta or +volume_filter_dt%stat_size for filtered statistics, but not both. If both beta and filtered statistics should be written in the same run step this loses one of them. Compute alt_sys by starting from sys_size and adding each contribution independently (check present(beta) and the filtered condition separately). [logic error]

Severity Level: Critical 🚨
- ❌ MPI restart variable count (NVARS_MOK) may be wrong.
- ❌ Restart/restart_data file layout corrupted.
- ⚠️ Filtered statistics output omitted unexpectedly.
Suggested change
-if (present(beta)) then
-    alt_sys = sys_size + 1
-else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
-    alt_sys = sys_size + volume_filter_dt%stat_size
-else
-    alt_sys = sys_size
+alt_sys = sys_size
+if (present(beta)) then
+    alt_sys = alt_sys + 1
+end if
+if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
+    alt_sys = alt_sys + volume_filter_dt%stat_size
Steps of Reproduction ✅
1. In s_write_parallel_data_files (src/simulation/m_data_output.fpp) the altered system size alt_sys is computed at lines 848-854 using an if / else if chain.

2. Scenario: both a beta correction is present (present(beta) true) and filtered-statistics are scheduled to be written (q_filtered_wrt true and t_step matches 0 or t_step_stop).

3. Current code picks the first branch (present(beta)) and sets alt_sys = sys_size + 1, skipping the addition of volume_filter_dt%stat_size. Later the code uses alt_sys when deciding how many variables to write (for example the do i = 1, alt_sys loop added at src/simulation/m_data_output.fpp line 1019 of the PR hunk), and when setting NVARS_MOK for MPI offsets (NVARS_MOK = int(alt_sys, MPI_OFFSET_KIND) in the MFC_MPI path).

4. Observe: the file layout/loop count will omit the filtered-statistics block (or omit the beta block depending on ordering), producing truncated or misaligned MPI writes and corrupted or incomplete restart/statistics files. The improved_code sums both contributions.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/simulation/m_data_output.fpp
**Line:** 848:853
**Comment:**
	*Logic Error: The computation of `alt_sys` uses an if/else-if chain so only one extra contribution is added: either `+1` for `beta` or `+volume_filter_dt%stat_size` for filtered statistics, but not both. If both `beta` and filtered statistics should be written in the same run step this loses one of them. Compute `alt_sys` by starting from `sys_size` and adding each contribution independently (check `present(beta)` and the filtered condition separately).

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

@codeant-ai

codeant-ai bot commented Jan 26, 2026

CodeAnt AI Incremental review completed.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/simulation/m_ibm.fpp (1)

76-83: Inconsistent allocation logic between 2D and 3D cases.

In the 3D case (lines 70-75), levelset%sf and levelset_norm%sf are only allocated when store_levelset is true. However, in the 2D case (lines 79-82), these arrays are allocated unconditionally. Since deallocation at lines 1341-1344 is guarded by store_levelset, this creates a memory leak in 2D when store_levelset=F.

🐛 Suggested fix
         else
             @:ALLOCATE(ib_markers%sf(-buff_size:m+buff_size, &
                 -buff_size:n+buff_size, 0:0))
-            @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
-                -buff_size:n+buff_size, 0:0, 1:num_ibs))
-            @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
-                -buff_size:n+buff_size, 0:0, 1:num_ibs, 1:3))
+            if (store_levelset) then
+                @:ALLOCATE(levelset%sf(-buff_size:m+buff_size, &
+                    -buff_size:n+buff_size, 0:0, 1:num_ibs))
+                @:ALLOCATE(levelset_norm%sf(-buff_size:m+buff_size, &
+                    -buff_size:n+buff_size, 0:0, 1:num_ibs, 1:3))
+            end if
         end if
🤖 Fix all issues with AI agents
In `@src/common/m_ib_patches.fpp`:
- Around line 621-644: The GPU parallel loop that marks ib_markers_sf inside the
periodic_ibs block uses $:GPU_PARALLEL_LOOP with private='[cart_y, cart_z, ix,
iy, iz]' but omits the loop indices i, j, k; update the private list for the
GPU_PARALLEL_LOOP surrounding the nested do k=0,p / do j=0,n / do i=0,m block
(the periodic_ibs section that sets ib_markers_sf based on patch_id, center, r2
and calls s_convert_cylindrical_to_cartesian_coord) to also include i, j, k so
those loop counters are private on the device and avoid potential data races.

In `@src/simulation/m_data_output.fpp`:
- Around line 964-968: The call to s_initialize_mpi_data_filtered dereferences
optional arguments when q_filtered_wrt is true (at the t_step check) without
verifying their presence; update the conditional to require present(...) for
each optional actual (filtered_fluid_indicator_function, stat_q_cons_filtered,
stat_filtered_pressure, stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
before calling s_initialize_mpi_data_filtered, and if any are absent emit a
clear error (abort) explaining the missing argument(s) so the routine is never
called with absent optionals.
🧹 Nitpick comments (2)
src/simulation/m_ibm.fpp (1)

682-726: Remove unused variables and commented-out code.

Variables subsection_x, subsection_y, subsection_z (lines 682-684) are declared but only used in commented-out code (lines 716-725). This creates dead code and confusion. Either remove the declarations and comments, or document why this alternative implementation is preserved for future use.

♻️ If removing unused code
         integer, dimension(2*gp_layers + 1, 2*gp_layers + 1, 2*gp_layers + 1) &
             :: subsection_3D
-        integer, dimension(2*gp_layers + 1) :: subsection_x
-        integer, dimension(2*gp_layers + 1) :: subsection_y
-        integer, dimension(2*gp_layers + 1) :: subsection_z
         integer :: i, j, k!< Iterator variables

And remove the commented blocks at lines 716-725.

src/common/m_mpi_common.fpp (1)

1606-1616: Consider adding minimum cell count validation for slab decomposition.

The slab decomposition branch only checks divisibility (mod((p + 1), num_procs_z) == 0) but doesn't validate that each processor receives at least num_stcls_min*recon_order cells in the z-direction, unlike the other decomposition branches. This could allow invalid decompositions with too few cells for the stencil.

♻️ Suggested validation
                 else if (slab_domain_decomposition) then
                     num_procs_x = 1
                     num_procs_y = 1
                     num_procs_z = num_procs
                     ierr = -1
-                    if (mod((p + 1), num_procs_z) == 0) then
+                    if (mod((p + 1), num_procs_z) == 0 .and. &
+                        (p + 1)/num_procs_z >= num_stcls_min*recon_order) then
                         ierr = 0
                     end if

Comment on lines +621 to +644
if (periodic_ibs) then
$:GPU_PARALLEL_LOOP(collapse=3, private='[cart_y, cart_z, ix, iy, iz]', copy='[ib_markers_sf]', copyin='[patch_id, center, r2]')
do k = 0, p
do j = 0, n
do i = 0, m
if (grid_geometry == 3) then
call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))
else
cart_y = y_cc(j)
cart_z = z_cc(k)
end if
do ix = 1, 2
do iy = 1, 2
do iz = 1, 2
if (((x_cc(i) - center(1, ix))**2 + (cart_y - center(2, iy))**2 + (cart_z - center(3, iz))**2) <= r2) then
ib_markers_sf(i, j, k) = patch_id
end if
end do
end do
end do
end do
end do
end do
$:END_GPU_PARALLEL_LOOP()

⚠️ Potential issue | 🟠 Major

Missing loop indices in GPU parallel loop private list.

The private list on line 622 includes [cart_y, cart_z, ix, iy, iz] but omits the outer loop indices i, j, k. Other GPU loops in this file (e.g., lines 170, 646, 709) consistently include loop indices in their private lists. Missing these could cause data races or incorrect behavior on GPU.

🐛 Proposed fix
-$:GPU_PARALLEL_LOOP(collapse=3, private='[cart_y, cart_z, ix, iy, iz]', copy='[ib_markers_sf]', copyin='[patch_id, center, r2]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i, j, k, cart_y, cart_z, ix, iy, iz]', copy='[ib_markers_sf]', copyin='[patch_id, center, r2]')
🤖 Prompt for AI Agents
In `@src/common/m_ib_patches.fpp` around lines 621 - 644, The GPU parallel loop
that marks ib_markers_sf inside the periodic_ibs block uses $:GPU_PARALLEL_LOOP
with private='[cart_y, cart_z, ix, iy, iz]' but omits the loop indices i, j, k;
update the private list for the GPU_PARALLEL_LOOP surrounding the nested do
k=0,p / do j=0,n / do i=0,m block (the periodic_ibs section that sets
ib_markers_sf based on patch_id, center, r2 and calls
s_convert_cylindrical_to_cartesian_coord) to also include i, j, k so those loop
counters are private on the device and avoid potential data races.

Comment on lines +964 to +968
if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
stat_q_cons_filtered, stat_filtered_pressure, &
stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
end if

⚠️ Potential issue | 🟠 Major

Guard optional filtered args before initializing MPI data.
Line 964 dereferences optional arguments without present() checks when q_filtered_wrt is true. If any are omitted, behavior is undefined.

✅ Proposed fix (add presence guard + abort)
-            if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
-                call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
-                                                    stat_q_cons_filtered, stat_filtered_pressure, &
-                                                    stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
-            end if
+            if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
+                if (.not. (present(filtered_fluid_indicator_function) .and. &
+                           present(stat_q_cons_filtered) .and. present(stat_filtered_pressure) .and. &
+                           present(stat_reynolds_stress) .and. present(stat_eff_visc) .and. &
+                           present(stat_int_mom_exch))) then
+                    call s_mpi_abort('q_filtered_wrt requires all filtered-stat arguments in s_write_parallel_data_files.')
+                end if
+                call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
+                                                    stat_q_cons_filtered, stat_filtered_pressure, &
+                                                    stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
+            end if
🤖 Prompt for AI Agents
In `@src/simulation/m_data_output.fpp` around lines 964 - 968, The call to
s_initialize_mpi_data_filtered dereferences optional arguments when
q_filtered_wrt is true (at the t_step check) without verifying their presence;
update the conditional to require present(...) for each optional actual
(filtered_fluid_indicator_function, stat_q_cons_filtered,
stat_filtered_pressure, stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
before calling s_initialize_mpi_data_filtered, and if any are absent emit a
clear error (abort) explaining the missing argument(s) so the routine is never
called with absent optionals.
