get queue elements v2 rpc for code rabbit review #49
Conversation
Walkthrough

Replaces get_batch_address_update_info with get_queue_info; refactors get_queue_elements for per-queue pagination, merged proofs, and cache validation; splits the indexed Merkle range-hash into v1/v2; generalizes the queue hash cache over ConnectionTrait and defers caching until validation; adds GCS resumable upload and token logic; assorted logging, test, and manifest updates.
```mermaid
sequenceDiagram
    participant Client
    participant RPC as RPC_Server
    participant API as PhotonApi
    participant DB as Database
    participant Cache as HashChainCache
    participant ZKP as Hasher/ZKP
    Client->>RPC: send getQueueElements (per-queue request)
    RPC->>API: parse payload, call get_queue_elements
    API->>DB: begin txn, fetch tree metadata & queue elements
    API->>Cache: get_cached_hash_chains (per-batch)
    alt cache hit
        Cache-->>API: cached hash chains
    else cache miss / partial
        API->>ZKP: compute_state_queue_hash_chains
        ZKP-->>API: computed hash chains
    end
    API->>DB: fetch proofs, path nodes, subtrees
    DB-->>API: proofs, nodes, subtrees
    API->>API: merge_state_queue_proofs & deduplicate nodes
    API->>DB: validate on-chain vs computed chains
    opt validated chains
        API->>Cache: store validated hash chains
    end
    opt invalid cached indices
        API->>Cache: delete_hash_chains (invalid)
    end
    API->>DB: commit txn
    API-->>RPC: assemble response
    RPC-->>Client: return response
```
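The cache hit/miss branch in the diagram amounts to a get-or-compute lookup keyed per batch. A minimal sketch, assuming hypothetical names (`HashChainCache`, `get_or_compute`) rather than Photon's actual API, which is database-backed and defers storing until on-chain validation succeeds:

```rust
use std::collections::HashMap;

// Illustrative stand-ins; the real cache is keyed per
// (tree, queue_type, batch_start_index) and lives in the database.
type BatchIndex = u64;
type HashChain = [u8; 32];

struct HashChainCache {
    entries: HashMap<BatchIndex, HashChain>,
}

impl HashChainCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Return the cached chain for a batch, or compute and remember it.
    /// Mirrors the alt cache-hit / cache-miss branch in the diagram.
    fn get_or_compute<F>(&mut self, batch: BatchIndex, compute: F) -> HashChain
    where
        F: FnOnce() -> HashChain,
    {
        if let Some(chain) = self.entries.get(&batch) {
            return *chain; // cache hit
        }
        let chain = compute(); // cache miss: fall back to the hasher
        // In Photon, storing is deferred until validation; here we store
        // immediately for simplicity.
        self.entries.insert(batch, chain);
        chain
    }
}

fn main() {
    let mut cache = HashChainCache::new();
    let mut computations = 0;
    let chain1 = cache.get_or_compute(0, || { computations += 1; [1u8; 32] });
    let chain2 = cache.get_or_compute(0, || { computations += 1; [9u8; 32] });
    assert_eq!(chain1, chain2); // second call served from cache
    assert_eq!(computations, 1);
}
```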
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/ingester/persist/spend.rs (1)
78-98: Add `token_accounts` handling to match non-batched spend behavior.

The addition of `Spent = true` at line 94 correctly aligns the batched updates with `spend_input_accounts`. However, the batched function omits `token_accounts` entirely, unlike the non-batched version, which updates both `accounts` (lines 21–42) and `token_accounts` (lines 44–66) with the `Spent` flag. This inconsistency means token accounts in batched trees won't be marked as spent.

Add `token_accounts` updates to `spend_input_accounts_batched` to mirror the non-batched implementation:

- Iterate through accounts and update `token_accounts::Entity` with `Spent = true` (and `PrevSpent`)
- Use the same `execute_account_update_query_and_update_balances` helper for consistency

Also update the function comment (line 70) from:

```rust
/// Update the nullifier queue index and nullifier of the input accounts in batched trees.
```

to:

```rust
/// Mark accounts as spent and update the nullifier queue index and nullifier of the input accounts in batched trees.
```

src/api/api.rs (1)
390-574: Missing `getQueueInfo` entry in OpenAPI specifications.

The `method_api_specs()` function includes `getQueueElements` but does not include `getQueueInfo`. This means the new endpoint won't appear in the generated OpenAPI documentation.

If this is intentional (e.g., the endpoint is internal), please disregard. Otherwise, add the spec entry:

```rust
OpenApiSpec {
    name: "getQueueInfo".to_string(),
    request: Some(GetQueueInfoRequest::schema().1),
    response: GetQueueInfoResponse::schema().1,
},
```
🧹 Nitpick comments (9)
src/ingester/parser/state_update.rs (1)
188-191: Confirm widened account_transactions retention and add a targeted test.

Seeding `kept_account_hashes` with `self.in_accounts` means `account_transactions` are now kept even when they only reference input/spent accounts that lack any associated tree metadata. This aligns with the comment about input accounts "not having tree info but should be kept," but it does subtly widen what survives the known-tree filter.

To lock this behavior in and avoid regressions, consider adding a small test case where:

- `StateUpdate::in_accounts` contains a hash with no corresponding `out_accounts` or tree info,
- `account_transactions` contains a transaction for that hash,
- `filter_by_known_trees` is called with an empty/irrelevant tree set,

and assert that the transaction is still present in `result.state_update.account_transactions`.
18-40: Pubkey conversion helper and LightPubkey mapping look correct; only minor style tweaks possible.

The `to_light_pubkey` helper cleanly centralizes conversion, and its use when building `light_program_ids`/`light_accounts` is logically sound; no correctness or ownership issues stand out. If you want to shave a bit of boilerplate, you could lean on function pointers when mapping:

```diff
-    let light_program_ids: Vec<LightPubkey> =
-        program_ids.iter().map(|p| to_light_pubkey(p)).collect();
+    let light_program_ids: Vec<LightPubkey> =
+        program_ids.iter().map(to_light_pubkey).collect();

-    let light_accounts: Vec<Vec<LightPubkey>> = accounts
-        .into_iter()
-        .map(|acc_vec| {
-            acc_vec
-                .into_iter()
-                .map(|acc| to_light_pubkey(&acc))
-                .collect()
-        })
-        .collect();
+    let light_accounts: Vec<Vec<LightPubkey>> = accounts
+        .into_iter()
+        .map(|acc_vec| acc_vec.iter().map(to_light_pubkey).collect())
+        .collect();
```

Purely optional; the current implementation is fine as-is.
src/ingester/persist/indexed_merkle_tree/helpers.rs (3)
93-93: Consider propagating error instead of unwrap on Poseidon initialization.

`Poseidon::new_circom(2).unwrap()` could theoretically panic if the Poseidon initialization fails. While this is unlikely for a hardcoded arity of 2, consider using `?` with error conversion for defensive robustness, matching the error handling pattern used for subsequent operations.

```diff
-    let mut poseidon = Poseidon::<Fr>::new_circom(2).unwrap();
+    let mut poseidon = Poseidon::<Fr>::new_circom(2)
+        .map_err(|e| IngesterError::ParserError(format!("Failed to init Poseidon: {}", e)))?;
```

Note: the same pattern exists in `compute_range_node_hash_v1` at line 74.
117-130: Redundant import inside function scope.

The import `use light_hasher::bigint::bigint_to_be_bytes_array;` on line 118 is redundant since it's already imported at the module level on line 8.

```diff
 pub fn get_zeroeth_exclusion_range_v1(tree: Vec<u8>) -> indexed_trees::Model {
-    use light_hasher::bigint::bigint_to_be_bytes_array;
-
     indexed_trees::Model {
```
132-145: Redundant import inside function scope.

The import `use light_hasher::bigint::bigint_to_be_bytes_array;` on line 133 is also redundant since it's already imported at the module level on line 8.

```diff
 pub fn get_top_element(tree: Vec<u8>) -> indexed_trees::Model {
-    use light_hasher::bigint::bigint_to_be_bytes_array;
-
     indexed_trees::Model {
```
142-145: Silent fallback for missing queue mapping may hide data inconsistency.

When `tree_to_queue.get(&tree_bytes)` returns `None`, the code falls back to `vec![0u8; 32]`, which produces a base58 string of all zeros. This could hide a data integrity issue where tree metadata lacks a queue pubkey.

Consider logging a warning or returning an error if a tree is expected to have a queue mapping but doesn't.

```diff
 let queue_bytes = tree_to_queue
     .get(&tree_bytes)
     .cloned()
-    .unwrap_or_else(|| vec![0u8; 32]);
+    .unwrap_or_else(|| {
+        tracing::warn!("Missing queue pubkey for tree: {}", bs58::encode(&tree_bytes).into_string());
+        vec![0u8; 32]
+    });
```
54-74: Consider consistent serialization behavior for optional fields.

The `skip_serializing_if` attribute is only applied to `*_zkp_batch_size` fields but not to `*_start_index` or `*_limit` fields. This inconsistency could lead to unexpected JSON serialization behavior where some `null` values are included and others are omitted.

For consistency, either apply `skip_serializing_if` to all optional fields or remove it entirely:

```diff
 pub struct GetQueueElementsRequest {
     pub tree: Hash,
+    #[serde(skip_serializing_if = "Option::is_none")]
     pub output_queue_start_index: Option<u64>,
+    #[serde(skip_serializing_if = "Option::is_none")]
     pub output_queue_limit: Option<u16>,
     #[serde(skip_serializing_if = "Option::is_none")]
     pub output_queue_zkp_batch_size: Option<u16>,
+    #[serde(skip_serializing_if = "Option::is_none")]
     pub input_queue_start_index: Option<u64>,
+    #[serde(skip_serializing_if = "Option::is_none")]
     pub input_queue_limit: Option<u16>,
     #[serde(skip_serializing_if = "Option::is_none")]
     pub input_queue_zkp_batch_size: Option<u16>,
     // ... same for address_queue fields
 }
```
569-569: Redundant import statement.

`create_hash_chain_from_slice` is already imported at line 16 (`use light_compressed_account::hash_chain::create_hash_chain_from_slice;`), making this inner `use` statement unnecessary.

```diff
 fn compute_state_queue_hash_chains(
     queue_elements: &[QueueElement],
     queue_type: QueueType,
     zkp_batch_size: usize,
 ) -> Result<Vec<Hash>, PhotonApiError> {
-    use light_compressed_account::hash_chain::create_hash_chain_from_slice;
-
     if zkp_batch_size == 0 || queue_elements.is_empty() {
```
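For context on what `compute_state_queue_hash_chains` appears to do, here is a minimal sketch of per-batch hash chaining: elements are split into `zkp_batch_size` chunks and one chain is folded per chunk. `toy_hash` stands in for the Poseidon-based `create_hash_chain_from_slice`, and skipping trailing partial batches is an assumption for illustration, not confirmed behavior:

```rust
// Stand-in for the real Poseidon hash; only the chaining structure matters here.
fn toy_hash(a: &[u8; 32], b: &[u8; 32]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = a[i].wrapping_mul(31).wrapping_add(b[i]);
    }
    out
}

fn compute_hash_chains(elements: &[[u8; 32]], zkp_batch_size: usize) -> Vec<[u8; 32]> {
    if zkp_batch_size == 0 || elements.is_empty() {
        return Vec::new();
    }
    elements
        .chunks(zkp_batch_size)
        // Assumption: only complete batches can be proven, so skip a trailing
        // partial chunk.
        .filter(|chunk| chunk.len() == zkp_batch_size)
        .map(|chunk| {
            chunk[1..]
                .iter()
                .fold(chunk[0], |acc, el| toy_hash(&acc, el))
        })
        .collect()
}

fn main() {
    let elements = [[1u8; 32], [2u8; 32], [3u8; 32], [4u8; 32], [5u8; 32]];
    let chains = compute_hash_chains(&elements, 2);
    assert_eq!(chains.len(), 2); // 5 elements at batch size 2 -> 2 full batches
}
```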
906-914: Silent failure when storing hash chains to cache.

Using `let _ =` to discard the result of `store_hash_chains_batch` means cache storage failures are silently ignored. While this may be intentional (the cache is optional), it could hide persistent issues and make debugging difficult.

Consider logging cache storage failures:

```diff
 if !chains_to_cache.is_empty() {
-    let _ = queue_hash_cache::store_hash_chains_batch(
+    if let Err(e) = queue_hash_cache::store_hash_chains_batch(
         tx,
         tree_pubkey,
         QueueType::AddressV2,
         batch_start_index as u64,
         chains_to_cache,
     )
-    .await;
+    .await
+    {
+        log::warn!(
+            "Failed to cache address queue hash chains for tree {}: {}",
+            tree_pubkey,
+            e
+        );
+    }
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (31)
- .gitignore (1 hunks)
- Cargo.toml (0 hunks)
- src/api/api.rs (2 hunks)
- src/api/method/get_batch_address_update_info.rs (0 hunks)
- src/api/method/get_multiple_new_address_proofs.rs (1 hunks)
- src/api/method/get_queue_elements.rs (2 hunks)
- src/api/method/get_queue_info.rs (1 hunks)
- src/api/method/mod.rs (1 hunks)
- src/api/rpc_server.rs (1 hunks)
- src/ingester/fetchers/grpc.rs (1 hunks)
- src/ingester/parser/indexer_events.rs (1 hunks)
- src/ingester/parser/state_update.rs (2 hunks)
- src/ingester/parser/tx_event_parser_v2.rs (3 hunks)
- src/ingester/persist/indexed_merkle_tree/helpers.rs (4 hunks)
- src/ingester/persist/indexed_merkle_tree/mod.rs (1 hunks)
- src/ingester/persist/indexed_merkle_tree/proof.rs (0 hunks)
- src/ingester/persist/leaf_node.rs (2 hunks)
- src/ingester/persist/leaf_node_proof.rs (2 hunks)
- src/ingester/persist/mod.rs (1 hunks)
- src/ingester/persist/persisted_indexed_merkle_tree.rs (3 hunks)
- src/ingester/persist/persisted_state_tree.rs (1 hunks)
- src/ingester/persist/spend.rs (2 hunks)
- src/main.rs (2 hunks)
- src/monitor/mod.rs (1 hunks)
- src/monitor/queue_hash_cache.rs (3 hunks)
- src/monitor/queue_monitor.rs (2 hunks)
- src/openapi/mod.rs (2 hunks)
- src/openapi/specs/api.yaml (2 hunks)
- tests/integration_tests/batched_address_tree_tests.rs (3 hunks)
- tests/integration_tests/batched_state_tree_tests.rs (5 hunks)
- tests/integration_tests/utils.rs (2 hunks)
💤 Files with no reviewable changes (3)
- src/api/method/get_batch_address_update_info.rs
- src/ingester/persist/indexed_merkle_tree/proof.rs
- Cargo.toml
🧰 Additional context used
🧬 Code graph analysis (11)
src/api/method/mod.rs (2)
- src/api/api.rs (1)
  - get_queue_info (276-281)
- src/api/method/get_queue_info.rs (1)
  - get_queue_info (101-159)

src/ingester/persist/mod.rs (2)
- src/ingester/parser/indexer_events.rs (1)
  - tree_pubkey (34-39)
- src/ingester/parser/state_update.rs (1)
  - tree_pubkey (366-366)

src/ingester/persist/leaf_node.rs (1)
- tests/integration_tests/mock_tests.rs (1)
  - map (1507-1514)

src/monitor/queue_hash_cache.rs (2)
- src/ingester/parser/indexer_events.rs (1)
  - tree_pubkey (34-39)
- src/ingester/parser/state_update.rs (1)
  - tree_pubkey (366-366)

src/openapi/mod.rs (2)
- src/api/method/get_queue_elements.rs (1)
  - get_queue_elements (161-260)
- src/api/api.rs (1)
  - get_queue_elements (269-274)

src/ingester/persist/indexed_merkle_tree/mod.rs (1)
- src/ingester/persist/indexed_merkle_tree/helpers.rs (3)
  - compute_range_node_hash_v1 (73-86)
  - compute_range_node_hash_v2 (92-102)
  - get_top_element (132-145)

src/ingester/persist/persisted_indexed_merkle_tree.rs (1)
- src/ingester/persist/indexed_merkle_tree/helpers.rs (3)
  - compute_hash_with_cache (52-69)
  - compute_range_node_hash_v1 (73-86)
  - compute_range_node_hash_v2 (92-102)

tests/integration_tests/batched_state_tree_tests.rs (1)
- tests/integration_tests/utils.rs (1)
  - setup (230-239)

src/api/method/get_queue_info.rs (2)
- src/api/api.rs (2)
  - new (107-117)
  - get_queue_info (276-281)
- src/api/method/utils.rs (1)
  - bs58 (567-567)

tests/integration_tests/batched_address_tree_tests.rs (1)
- src/api/method/get_queue_elements.rs (1)
  - get_queue_elements (161-260)

src/api/api.rs (1)
- src/api/method/get_queue_info.rs (1)
  - get_queue_info (101-159)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (45)
src/ingester/persist/leaf_node_proof.rs (1)
32-37: LGTM! Useful observability addition.

The debug logging provides helpful context for troubleshooting proof fetching operations, capturing the number of indices, tree identifier, and current sequence number at an appropriate log level.
src/ingester/persist/persisted_state_tree.rs (1)
188-190: Guard against out-of-bounds slice on `EMPTY_SUBTREES` and confirm the new length contract.

This change looks semantically aligned with returning only `tree_height` subtrees, but it introduces a new runtime risk and an API shape change:

- Potential panic: `EMPTY_SUBTREES[..tree_height]` will panic if `tree_height > EMPTY_SUBTREES.len()` (40). Previously the empty-result path always returned the full `EMPTY_SUBTREES` and could not panic regardless of `tree_height`. Unless all call sites strictly guarantee `tree_height <= 40`, this is a new critical failure mode.

Consider adding an explicit bound check (and failing fast with a structured error) or clamping:

```rust
if results.is_empty() {
    if tree_height > EMPTY_SUBTREES.len() {
        return Err(PhotonApiError::UnexpectedError(format!(
            "Requested tree_height {} exceeds EMPTY_SUBTREES capacity {}",
            tree_height,
            EMPTY_SUBTREES.len()
        )));
    }
    return Ok(EMPTY_SUBTREES[..tree_height].to_vec());
}
```

Alternatively, if you know `tree_height` is always `<= MAX_HEIGHT`, a `debug_assert!(tree_height <= EMPTY_SUBTREES.len());` here (and/or at the call sites) would at least encode that invariant.
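As a variant of the guard suggested above, `slice::get` with a range returns `None` instead of panicking, which folds the bound check into the slicing itself. `EMPTY_SUBTREES` below is a stand-in for the real 40-entry precomputed table:

```rust
// Stand-in for Photon's precomputed table of empty-subtree hashes.
const EMPTY_SUBTREES: [[u8; 32]; 40] = [[0u8; 32]; 40];

/// Return the first `tree_height` empty subtrees, or an error instead of
/// panicking when the request exceeds the table's capacity.
fn empty_subtrees_prefix(tree_height: usize) -> Result<Vec<[u8; 32]>, String> {
    EMPTY_SUBTREES
        .get(..tree_height) // None when tree_height > 40, no panic
        .map(|s| s.to_vec())
        .ok_or_else(|| {
            format!(
                "Requested tree_height {} exceeds EMPTY_SUBTREES capacity {}",
                tree_height,
                EMPTY_SUBTREES.len()
            )
        })
}

fn main() {
    assert_eq!(empty_subtrees_prefix(32).unwrap().len(), 32);
    assert!(empty_subtrees_prefix(41).is_err());
}
```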
- Changed return length for empty results: callers that previously assumed a fixed length-40 vector from `get_subtrees` on empty DB state will now receive a shorter vector sized to `tree_height`. That matches the new per-tree, height-bounded semantics described in the PR, but it's worth double-checking all consumers to ensure they no longer hard-code `EMPTY_SUBTREES.len()` and instead rely on `tree_height` / the returned length.

src/monitor/queue_hash_cache.rs (2)
3-3: LGTM: Import change supports generic refactoring.

The import of `ConnectionTrait` correctly supports the generic parameter changes in the function signatures below.
14-23: No API compatibility issues identified; all call sites are already compatible with the new generic signatures.

The refactoring from `db: &DatabaseConnection` to `db: &C where C: ConnectionTrait` is backward compatible and an improvement, not a breaking change. Both `DatabaseConnection` and `DatabaseTransaction` implement `ConnectionTrait`. All call sites in the codebase already pass compatible types:

- `queue_monitor.rs` passes `&DatabaseConnection`
- `get_queue_elements.rs` passes `&DatabaseTransaction`

This generic approach enables broader flexibility (transactions, connection pools, testing scenarios) while maintaining compatibility with existing code.
src/monitor/mod.rs (1)
1-1: LGTM: Module visibility change exposes queue hash cache functionality.

Making `queue_hash_cache` public aligns with the broader PR objective to expose queue/hash-cache functionality to higher-level components and enables the generic DB interface usage pattern.
src/monitor/queue_monitor.rs (1)

8-8: LGTM: Appropriate log-level adjustment reduces noise.

The change from `warn!` to `debug!` for incomplete batch messages is correct: incomplete batches are expected during normal operation. The added comments provide valuable context, and removing the unused `warn` import keeps the code clean.

Also applies to: 353-362
src/ingester/parser/indexer_events.rs (1)
5-5: Import reordering is fine.

The reordered `light_event::event` import keeps semantics unchanged and matches the surrounding import style. No issues here.
10-10: Import placement looks consistentBringing
BatchNullifyContextandNewAddressin here matches their usage inStateUpdateand keeps imports grouped by domain. No behavior concerns.src/ingester/persist/leaf_node.rs (1)
208-212: LGTM! Good observability practice.The success logging provides useful feedback for debugging and monitoring persistence operations.
src/ingester/parser/tx_event_parser_v2.rs (1)
45-88:pubkey_arraypassthrough is consistent with other forwarded fieldsForwarding
public_transaction_event.event.pubkey_arraydirectly intoPublicTransactionEventmatches the pattern used for other scalar/event fields and keeps the ingester’s view aligned with the light-event type; no additional transformation or cloning seems necessary here..gitignore (1)
16-17: LGTM!Adding
proto/**/*.binto ignore generated proto binary files is a standard practice.src/ingester/fetchers/grpc.rs (1)
303-306: LGTM!The multi-line formatting improves readability without changing behavior.
src/ingester/persist/mod.rs (1)
182-186: LGTM!The enhanced error message provides clearer guidance by including the tree pubkey and explaining that tree metadata must be synced before indexing.
tests/integration_tests/utils.rs (2)
34-34: LGTM!The import alias properly avoids naming conflicts.
241-256: Clarify interaction with PHOTON_SKIP_ISOLATION_LEVEL flag.The code sets the session isolation level to READ COMMITTED, but line 182 in the same file sets
PHOTON_SKIP_ISOLATION_LEVEL=trueto skip isolation level settings in tests. Please verify whether this isolation level configuration is intended to take effect or if it's overridden by the environment variable.src/api/method/mod.rs (1)
24-24: LGTM!The module replacement aligns with the PR's objective to migrate from
get_batch_address_update_infotoget_queue_info.src/ingester/persist/persisted_indexed_merkle_tree.rs (3)
7-9: LGTM!The import updates correctly bring in version-specific hash computation functions to support v1 and v2 tree types.
50-59: LGTM!The switch to
compute_range_node_hash_v2for non-V1 tree types is consistent with the version-specific hash computation strategy (V1 trees use v1 hash at line 42, while V2+ trees use v2 hash here).
495-511: LGTM!The test correctly uses
compute_range_node_hash_v2to match the updated production code for v2 tree types.src/openapi/mod.rs (2)
23-25: LGTM!The import statement correctly includes the new queue data types required for the updated API surface.
86-89: LGTM!The OpenAPI component schemas correctly include the new queue data types, aligning with the API migration to the queue-based approach.
src/ingester/persist/indexed_merkle_tree/mod.rs (1)
8-12: LGTM - Public API updated to expose versioned hash functions.The re-exports correctly expose both
compute_range_node_hash_v1andcompute_range_node_hash_v2to support both AddressV1 (3-field hash) and AddressV2 (2-field hash) indexed Merkle tree types.src/api/rpc_server.rs (1)
197-201: LGTM - New RPC method follows established pattern.The
getQueueInfomethod registration is consistent with other RPC methods in the file, properly parsing the payload and delegating to the API layer with appropriate error conversion.tests/integration_tests/batched_address_tree_tests.rs (4)
9-9: LGTM - Import added for new request type.
166-177: LGTM - Test properly exercises new per-queue API.

The request correctly sets only `address_queue_limit` while leaving other queue-type parameters as `None`, which aligns with the API design that allows requesting specific queue types independently.
181-204: LGTM - Assertions updated for new response structure.

The test correctly extracts `address_queue` from the response and accesses addresses via `element.0.to_bytes()`, matching the new `AddressQueueData` structure.
225-247: LGTM - Post-update verification uses correct API pattern.

The verification correctly handles the optional `address_queue` with `map().unwrap_or_default()` for safe extraction.

src/ingester/persist/indexed_merkle_tree/helpers.rs (3)
14-32: LGTM - Tree type dispatch correctly routes to versioned hash functions.

The dispatch logic properly maps `AddressV1` to `compute_range_node_hash_v1` and `AddressV2` to `compute_range_node_hash_v2`, with clear inline comments explaining the difference in hash construction.
71-86: Documentation clarifies v1 hash construction.

Good addition of the doc comment explaining that v1 uses the 3-field Poseidon hash including `next_index`.
88-102: Well-documented v2 hash function with commit reference.

The documentation correctly explains that `next_index` is stored but not included in the hash for v2, with a helpful commit reference for traceability.
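To make the v1/v2 contrast concrete, a toy sketch with a stand-in hash (not the actual Poseidon circuit): v1 hashes (value, next_index, next_value) while v2 hashes only (value, next_value):

```rust
// Stand-in for the Poseidon hash over field elements; only the field count
// and ordering matter for this illustration.
fn toy_hash(fields: &[&[u8; 32]]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for f in fields {
        for i in 0..32 {
            out[i] = out[i].wrapping_mul(33).wrapping_add(f[i]);
        }
    }
    out
}

// v1: 3-field hash including next_index (serialized as a 32-byte BE field).
fn range_node_hash_v1(value: &[u8; 32], next_index: u64, next_value: &[u8; 32]) -> [u8; 32] {
    let mut idx = [0u8; 32];
    idx[24..].copy_from_slice(&next_index.to_be_bytes());
    toy_hash(&[value, &idx, next_value])
}

// v2: next_index is stored alongside the node but excluded from the hash.
fn range_node_hash_v2(value: &[u8; 32], next_value: &[u8; 32]) -> [u8; 32] {
    toy_hash(&[value, next_value])
}

fn main() {
    let (v, nv) = ([1u8; 32], [2u8; 32]);
    // Changing next_index changes the v1 hash but can never affect a v2 hash.
    assert_ne!(range_node_hash_v1(&v, 5, &nv), range_node_hash_v1(&v, 6, &nv));
    assert_eq!(range_node_hash_v2(&v, &nv), range_node_hash_v2(&v, &nv));
}
```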
src/api/method/get_queue_info.rs (3)

12-17: LGTM - Request structure with optional tree filter.

The `deny_unknown_fields` attribute provides good API contract enforcement. The optional `trees` parameter allows both filtered and unfiltered queries.
35-99: LGTM - Queue size fetching handles multiple tree types correctly.

The function correctly:

- Filters by StateV2 and AddressV2 tree types
- For StateV2: counts both input queue (nullifier) and output queue entries
- For AddressV2: counts address queue entries
- Uses the composite key `(tree_pubkey, queue_type)` to handle multiple queues per tree
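The composite-key aggregation described above can be sketched with a plain map; the `QueueType` variants here are illustrative names, not necessarily Photon's:

```rust
use std::collections::HashMap;

// Illustrative queue-type names for the composite-key sketch.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum QueueType {
    InputStateV2,
    OutputStateV2,
    AddressV2,
}

/// Count queue entries per (tree, queue_type) composite key, so one tree can
/// own several queues without their counts colliding.
fn count_queue_entries(entries: &[([u8; 32], QueueType)]) -> HashMap<([u8; 32], QueueType), u64> {
    let mut sizes = HashMap::new();
    for (tree, queue_type) in entries {
        *sizes.entry((*tree, *queue_type)).or_insert(0u64) += 1;
    }
    sizes
}

fn main() {
    let tree = [7u8; 32];
    let entries = vec![
        (tree, QueueType::InputStateV2),
        (tree, QueueType::OutputStateV2),
        (tree, QueueType::OutputStateV2),
    ];
    let sizes = count_queue_entries(&entries);
    // Same tree, distinct counts per queue type.
    assert_eq!(sizes[&(tree, QueueType::InputStateV2)], 1);
    assert_eq!(sizes[&(tree, QueueType::OutputStateV2)], 2);
}
```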
101-158: LGTM - Main function implementation is well-structured.

The function properly:
- Validates and parses input pubkeys
- Fetches queue sizes with optional filtering
- Joins with tree metadata to get queue pubkeys
- Returns the context slot for consistency
src/api/api.rs (2)
80-82: LGTM - Import added for new queue info types.
276-281: LGTM - API method follows established pattern.

The `get_queue_info` method correctly delegates to the module function with proper connection reference passing.
242-267: LGTM - Separate queue fetches with appropriate per-queue limits.The test correctly makes separate API calls for output queue (limit=100) and input queue (limit=100), matching the new per-queue pagination design.
242-267: LGTM - Separate queue fetches with appropriate per-queue limits.

The test correctly makes separate API calls for the output queue (limit=100) and the input queue (limit=100), matching the new per-queue pagination design.
- Extracts queue lengths using the nested optional structure with
map_or- Safely handles the case where
pre_input_queuemight be None- Uses
pre_input_queue.leavesandpre_input_queue.leaf_indicesfor batch processing
367-418: LGTM - Append event handling correctly uses new queue structure.The code properly extracts
pre_output_queueand accessesleavesandleaf_indicesfor the batch append verification.
478-511: LGTM - Final assertions verify empty queues after processing.

The assertions correctly check that both output and input queues are empty after all batch events are processed, using the new nested response structure.
src/api/method/get_queue_elements.rs (6)
30-42: LGTM!

The node index encoding is well-documented and correctly implemented. The use of `debug_assert` is appropriate for internal validation during development.
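For readers unfamiliar with such encodings, the common level-order scheme packs (depth, position) into one integer; this is a generic illustration of that idea, not necessarily the exact encoding used in `get_queue_elements.rs`:

```rust
// Level-order Merkle node indexing: root = 1, and a node at depth d,
// position i (0-based, left to right) gets index 2^d + i. Parent/child
// relations then reduce to integer halving/doubling.
fn node_index(depth: u32, position: u64) -> u64 {
    // debug_assert validates internal invariants in debug builds only,
    // mirroring the reviewed code's approach.
    debug_assert!(position < 1u64 << depth, "position out of range for depth");
    (1u64 << depth) + position
}

fn parent(index: u64) -> u64 {
    debug_assert!(index > 1, "root has no parent");
    index / 2
}

fn main() {
    assert_eq!(node_index(0, 0), 1); // root
    assert_eq!(node_index(2, 3), 7); // depth 2, rightmost node
    assert_eq!(parent(node_index(2, 3)), node_index(1, 1));
}
```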
161-260: LGTM!

The main function is well-structured with:
- Proper validation that at least one queue is requested
- Correct transaction handling with isolation
- Clean separation of output, input, and address queue fetching
- Proper merging of state queue proofs
262-310: LGTM!

The merge logic correctly collects proofs from both queues and deduplicates nodes. Since both queues are fetched from the same tree in the same transaction, the roots should be consistent.
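The merge-and-deduplicate step can be sketched as indexing nodes by their tree position; `Node` and its fields are stand-ins for the actual proof types:

```rust
use std::collections::BTreeMap;

// Stand-in for a proof path node keyed by its position in the tree.
#[derive(Clone, PartialEq, Debug)]
struct Node {
    index: u64,
    hash: [u8; 32],
}

/// Merge proof nodes from the output and input queues, deduplicating by node
/// index. Because both queues are read from the same tree in one transaction,
/// a duplicate index should carry an identical hash, so last write wins.
fn merge_and_dedup(output_nodes: Vec<Node>, input_nodes: Vec<Node>) -> Vec<Node> {
    let mut by_index: BTreeMap<u64, Node> = BTreeMap::new();
    for node in output_nodes.into_iter().chain(input_nodes) {
        by_index.insert(node.index, node);
    }
    by_index.into_values().collect() // sorted by index
}

fn main() {
    let out_nodes = vec![
        Node { index: 2, hash: [1; 32] },
        Node { index: 5, hash: [2; 32] },
    ];
    let in_nodes = vec![
        Node { index: 5, hash: [2; 32] },
        Node { index: 9, hash: [3; 32] },
    ];
    let merged = merge_and_dedup(out_nodes, in_nodes);
    assert_eq!(merged.len(), 3); // index 5 deduplicated
    assert_eq!(merged[0].index, 2);
}
```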
1000-1005: LGTM!

Clean helper function with proper error handling.
88-150: LGTM!

The data structures are well-designed with:

- Appropriate use of the `Default` derive for structures that may be empty
- Consistent serde attributes (`deny_unknown_fields`, `rename_all = "camelCase"`)
- Conditional serialization with `skip_serializing_if` for optional/empty fields
955-958: `is_multiple_of` is now stable in Rust 1.87.0 and later.

The `is_multiple_of` method was stabilized in Rust 1.87.0 (May 15, 2025) and no longer requires nightly Rust. The code is compatible with stable Rust, so no changes are needed.
```diff
       tree:
         $ref: '#/components/schemas/Hash'
       outputQueueStartIndex:
         type: integer
-        format: uint16
+        format: uint64
         nullable: true
         minimum: 0
-      queueType:
+      outputQueueLimit:
         type: integer
-        format: uint8
+        format: uint16
         nullable: true
         minimum: 0
-      startOffset:
+      inputQueueStartIndex:
         type: integer
         format: uint64
         nullable: true
         minimum: 0
-      tree:
-        $ref: '#/components/schemas/Hash'
+      inputQueueLimit:
+        type: integer
+        format: uint16
+        nullable: true
+        minimum: 0
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Request schema is missing several parameters from Rust implementation.
The Rust GetQueueElementsRequest includes additional parameters that are not documented in the OpenAPI spec:
- `address_queue_start_index`, `address_queue_limit`, `address_queue_zkp_batch_size`
- `output_queue_zkp_batch_size`, `input_queue_zkp_batch_size`
Add the missing parameters to align with the Rust implementation:
```diff
 properties:
   tree:
     $ref: '#/components/schemas/Hash'
   outputQueueStartIndex:
     type: integer
     format: uint64
     nullable: true
     minimum: 0
   outputQueueLimit:
     type: integer
     format: uint16
     nullable: true
     minimum: 0
+  outputQueueZkpBatchSize:
+    type: integer
+    format: uint16
+    nullable: true
+    minimum: 0
   inputQueueStartIndex:
     type: integer
     format: uint64
     nullable: true
     minimum: 0
   inputQueueLimit:
     type: integer
     format: uint16
     nullable: true
     minimum: 0
+  inputQueueZkpBatchSize:
+    type: integer
+    format: uint16
+    nullable: true
+    minimum: 0
+  addressQueueStartIndex:
+    type: integer
+    format: uint64
+    nullable: true
+    minimum: 0
+  addressQueueLimit:
+    type: integer
+    format: uint16
+    nullable: true
+    minimum: 0
+  addressQueueZkpBatchSize:
+    type: integer
+    format: uint16
+    nullable: true
+    minimum: 0
```
```diff
+      outputQueueElements:
+        type: array
+        nullable: true
+        items:
+          $ref: '#/components/schemas/GetQueueElementsResponseValue'
+      outputQueueIndex:
+        type: integer
+        format: uint64
+        nullable: true
+        minimum: 0
-      value:
+      inputQueueElements:
         type: array
         nullable: true
         items:
           $ref: '#/components/schemas/GetQueueElementsResponseValue'
+      inputQueueIndex:
+        type: integer
+        format: uint64
+        nullable: true
+        minimum: 0
```
Response schema doesn't match Rust implementation.
The OpenAPI response schema defines outputQueueElements, outputQueueIndex, inputQueueElements, and inputQueueIndex as top-level fields, but the Rust GetQueueElementsResponse structure uses state_queue: Option<StateQueueData> and address_queue: Option<AddressQueueData> instead. This will cause API documentation to be inconsistent with actual API behavior.
The response schema should reflect the actual Rust structure:

```diff
 result:
   type: object
   required:
     - context
   properties:
     context:
       $ref: '#/components/schemas/Context'
-    outputQueueElements:
-      type: array
-      nullable: true
-      items:
-        $ref: '#/components/schemas/GetQueueElementsResponseValue'
-    outputQueueIndex:
-      type: integer
-      format: uint64
-      nullable: true
-      minimum: 0
-    inputQueueElements:
-      type: array
-      nullable: true
-      items:
-        $ref: '#/components/schemas/GetQueueElementsResponseValue'
-    inputQueueIndex:
-      type: integer
-      format: uint64
-      nullable: true
-      minimum: 0
+    stateQueue:
+      $ref: '#/components/schemas/StateQueueData'
+      nullable: true
+    addressQueue:
+      $ref: '#/components/schemas/AddressQueueData'
+      nullable: true
```

Additionally, you'll need to add `StateQueueData`, `OutputQueueData`, `InputQueueData`, and `AddressQueueData` schemas to the components section.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/openapi/specs/api.yaml around lines 3802 to 3821, the response schema
currently exposes outputQueueElements/outputQueueIndex and
inputQueueElements/inputQueueIndex as top-level fields but the Rust type returns
state_queue: Option<StateQueueData> and address_queue: Option<AddressQueueData>;
update the response object to match the Rust structure by replacing those four
top-level fields with two optional objects: state_queue and address_queue, where
state_queue contains output: OutputQueueData and input: InputQueueData (or
similarly named properties matching Rust field names) and address_queue contains
the address-specific queue data; also add component schema definitions for
StateQueueData, OutputQueueData, InputQueueData, and AddressQueueData under
components/schemas with fields/types that mirror the Rust structs (arrays of
GetQueueElementsResponseValue, uint64 indexes nullable with minimum 0, etc.).
- feat: add getQueueInfo RPC method
- feat: add grpc photon.QueueService/SubscribeQueueUpdates
- update ci workflow
- feat: replace polling queue updates with event-based updates on ingestion
- cleanup
- format cleanup
- fix: tracking account_transaction for input accounts
- feat: distinct output and input queues in the `get_queue_elements`
- cleanup
src/api/method/get_queue_info_BASE_16635.rs
…renamed get_queue_elements_v2 to get_queue_elements.
- Deleted the `get_queue_elements_v2.rs` file, which contained the implementation for the `get_queue_elements_v2` API method.
- Removed the module declaration for `get_queue_elements_v2` in `mod.rs`.
- Eliminated the registration of the `getQueueElementsV2` RPC method in `rpc_server.rs`.
- Cleaned up unused imports and references to `get_queue_elements_v2` in the OpenAPI module.
…d inserting parent hashes in `nodes_map`
6fc9d0a to 03d2c74
Actionable comments posted: 1
♻️ Duplicate comments (3)
src/openapi/specs/api.yaml (2)
3739-3765: `GetQueueElementsRequest` OpenAPI schema still missing fields present in Rust request.

The Rust `GetQueueElementsRequest` includes additional optional parameters that aren't documented here:

- `output_queue_zkp_batch_size: Option<u16>`
- `input_queue_zkp_batch_size: Option<u16>`
- `address_queue_start_index: Option<u64>`
- `address_queue_limit: Option<u16>`
- `address_queue_zkp_batch_size: Option<u16>`

To keep the spec aligned with the API, please add these properties (nullable integers with the appropriate formats and `minimum: 0`) next to the existing per-queue start/limit fields, e.g.:

```diff
 properties:
   tree:
     $ref: '#/components/schemas/Hash'
   outputQueueStartIndex:
     type: integer
     format: uint64
     nullable: true
     minimum: 0
   outputQueueLimit:
     type: integer
     format: uint16
     nullable: true
     minimum: 0
+  outputQueueZkpBatchSize:
+    type: integer
+    format: uint16
+    nullable: true
+    minimum: 0
   inputQueueStartIndex:
     type: integer
     format: uint64
     nullable: true
     minimum: 0
   inputQueueLimit:
     type: integer
     format: uint16
     nullable: true
     minimum: 0
+  inputQueueZkpBatchSize:
+    type: integer
+    format: uint16
+    nullable: true
+    minimum: 0
+  addressQueueStartIndex:
+    type: integer
+    format: uint64
+    nullable: true
+    minimum: 0
+  addressQueueLimit:
+    type: integer
+    format: uint16
+    nullable: true
+    minimum: 0
+  addressQueueZkpBatchSize:
+    type: integer
+    format: uint16
+    nullable: true
+    minimum: 0
```
3796-3821: `GetQueueElementsResponse` OpenAPI shape does not match Rust `GetQueueElementsResponse`.

The Rust response type now exposes:

- `context: Context`
- `state_queue: Option<StateQueueData>`
- `address_queue: Option<AddressQueueData>`

but the OpenAPI response here still documents:

- `outputQueueElements`/`outputQueueIndex`
- `inputQueueElements`/`inputQueueIndex`
- and references `GetQueueElementsResponseValue` in components.

This will confuse integrators and diverges from the new queue data model. Please update the response to match the Rust type and use the new component schemas you added (`StateQueueData`, `OutputQueueData`, `InputQueueData`, `AddressQueueData`), e.g.:

```diff
 result:
   type: object
   required:
     - context
   properties:
     context:
       $ref: '#/components/schemas/Context'
-    outputQueueElements:
-      type: array
-      nullable: true
-      items:
-        $ref: '#/components/schemas/GetQueueElementsResponseValue'
-    outputQueueIndex:
-      type: integer
-      format: uint64
-      nullable: true
-      minimum: 0
-    inputQueueElements:
-      type: array
-      nullable: true
-      items:
-        $ref: '#/components/schemas/GetQueueElementsResponseValue'
-    inputQueueIndex:
-      type: integer
-      format: uint64
-      nullable: true
-      minimum: 0
+    stateQueue:
+      allOf:
+        - $ref: '#/components/schemas/StateQueueData'
+      nullable: true
+    addressQueue:
+      allOf:
+        - $ref: '#/components/schemas/AddressQueueData'
+      nullable: true
```

Once you do that, `GetQueueElementsResponseValue` should no longer be referenced; if it's truly unused after the refactor, consider removing that schema from `components` to keep the spec clean.

Also applies to: 4761-4794
src/api/method/get_queue_elements.rs (1)
312-462: Avoid `unwrap()` on `Hash::new` and clarify ZKP batch truncation semantics.

Two points in `fetch_queue`:

1. Potential panic on malformed hashes (repeat of earlier review):

```rust
let account_hashes: Vec<Hash> = queue_elements
    .iter()
    .map(|e| Hash::new(e.hash.as_slice()).unwrap())
    .collect();
```

If any row has a non-32-byte `hash`, this will panic. You already handle malformed `tx_hash`/`nullifier` with proper error propagation; doing the same here would make this path consistent and safer. For example:

```diff
-let account_hashes: Vec<Hash> = queue_elements
-    .iter()
-    .map(|e| Hash::new(e.hash.as_slice()).unwrap())
-    .collect();
+let account_hashes: Vec<Hash> = queue_elements
+    .iter()
+    .map(|e| {
+        Hash::new(e.hash.as_slice()).map_err(|err| {
+            PhotonApiError::UnexpectedError(format!(
+                "Invalid account hash (leaf_index={}): {}",
+                e.leaf_index, err
+            ))
+        })
+    })
+    .collect::<Result<Vec<_>, _>>()?;
```

2. Behavior when there are fewer elements than a full ZKP batch:

When `indices.len() < zkp_batch_size`, `full_batches == 0` and `allowed == 0`, so you immediately return an empty/default queue (no elements, no proofs). If that's the intended contract ("only full batches are ever returned"), it'd be good to document it clearly; otherwise, you may want to either:

- Return the partial batch without ZKP-related data, or
- Tighten the API to make it explicit that callers must request limits that are multiples of the batch size.

Also, note that `start_index: u64` is cast to `i64` for DB filtering; if you ever expect queue indices approaching `i64::MAX`, some validation (or a checked cast) would avoid surprising wraparound.
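To make the truncation contract and the checked cast concrete, here is a small standalone sketch. The names `allowed_elements` and `start_index_as_i64` are illustrative helpers mirroring the review's description (`full_batches`, `allowed`), not the crate's actual functions:

```rust
/// Truncate `available` elements down to whole ZKP batches; returns 0 when
/// fewer than one full batch is available, matching the behavior discussed above.
fn allowed_elements(available: usize, zkp_batch_size: usize) -> usize {
    if zkp_batch_size == 0 {
        return 0; // guard against a zero batch-size hint
    }
    let full_batches = available / zkp_batch_size;
    full_batches * zkp_batch_size
}

/// Checked cast for DB filtering: reject start indices that don't fit in i64
/// instead of silently wrapping around.
fn start_index_as_i64(start_index: u64) -> Result<i64, String> {
    i64::try_from(start_index)
        .map_err(|_| format!("start_index {} exceeds i64::MAX", start_index))
}

fn main() {
    assert_eq!(allowed_elements(25, 10), 20); // two full batches, 5 elements dropped
    assert_eq!(allowed_elements(7, 10), 0); // no full batch -> empty result
    assert_eq!(start_index_as_i64(42), Ok(42));
    assert!(start_index_as_i64(u64::MAX).is_err());
}
```

Whichever contract the API settles on, spelling it out this way makes the "partial batches are discarded" behavior easy to document and test.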
🧹 Nitpick comments (6)
src/ingester/persist/indexed_merkle_tree/helpers.rs (1)
8-8: Remove redundant local imports.

`bigint_to_be_bytes_array` is imported at module level (line 8), but is also imported locally in `get_zeroeth_exclusion_range_v1` (line 118) and `get_top_element` (line 133). The local imports are redundant.

```diff
 pub fn get_zeroeth_exclusion_range_v1(tree: Vec<u8>) -> indexed_trees::Model {
-    use light_hasher::bigint::bigint_to_be_bytes_array;
     indexed_trees::Model {
```

```diff
 pub fn get_top_element(tree: Vec<u8>) -> indexed_trees::Model {
-    use light_hasher::bigint::bigint_to_be_bytes_array;
     indexed_trees::Model {
```

Also applies to: 118-118, 133-133
tests/integration_tests/utils.rs (1)
243-257: `SET SESSION` only affects one connection; consider configuring per-connection or adjust the comment.

This pattern:

```rust
let pool = PgPoolOptions::new()
    .min_connections(1)
    .connect_with(options)
    .await
    .unwrap();
sqlx::query("SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED")
    .execute(&pool)
    .await
    .unwrap();
```

runs `SET SESSION ...` on a single connection checked out from the pool. If the pool ever opens additional connections (e.g., under concurrent load), those new connections will still use the server's default isolation level, so the comment "for all connections in the pool" is not strictly true.

For tests where you likely stay within a single connection this is probably fine, but to make the behavior robust and future-proof you could either:

- Configure this per-connection using `PgPoolOptions::after_connect`, or
- Keep the current code but relax the comment to avoid implying all present/future pool connections are guaranteed to be `READ COMMITTED`.

Conceptual example using `after_connect`:

```diff
 pub async fn setup_pg_pool(database_url: String) -> PgPool {
     let options: PgConnectOptions = database_url.parse().unwrap();
-    let pool = PgPoolOptions::new()
-        .min_connections(1)
-        .connect_with(options)
-        .await
-        .unwrap();
-
-    // Set default isolation level to READ COMMITTED for all connections in the pool
-    // This ensures each statement sees the latest committed data
-    sqlx::query("SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED")
-        .execute(&pool)
-        .await
-        .unwrap();
-
-    pool
+    PgPoolOptions::new()
+        .min_connections(1)
+        .after_connect(|conn, _meta| {
+            Box::pin(async move {
+                sqlx::query(
+                    "SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED",
+                )
+                .execute(conn)
+                .await?;
+                Ok(())
+            })
+        })
+        .connect_with(options)
+        .await
+        .unwrap()
 }
```

You may need to tweak the exact `after_connect` signature to match your `sqlx` version, but the idea is to ensure every connection created by the pool gets the same isolation-level configuration.

src/ingester/parser/state_update.rs (1)
188-191: Including `in_accounts` in `kept_account_hashes` looks correct but deserves a regression test.

Extending `kept_account_hashes` with `self.in_accounts` ensures `account_transactions` are now retained for spent/input accounts even when those accounts don't have associated tree metadata in this update. That matches the comment and fixes the previous blind spot.

It would be valuable to add a focused test where:

- an account hash appears only in `in_accounts` (no corresponding `out_accounts`), and
- there's an `AccountTransaction` for that hash,

and verify that `filter_by_known_trees` keeps the `AccountTransaction` after filtering.

src/ingester/parser/tx_event_parser_v2.rs (1)
87-87: Moving `pubkey_array` into `PublicTransactionEvent` is safe and slightly more efficient.

`pubkey_array: public_transaction_event.event.pubkey_array` now moves the field instead of cloning/indirect access. Since `public_transaction_event.event` isn't used after this and only other top-level fields of `public_transaction_event` are read, this is borrow-checker safe and avoids an extra allocation/copy.

src/api/method/get_queue_elements.rs (2)
235-251: Inconsistent handling of 0 ZKP batch size hints for address queues and cache semantics.

For state queues, you treat a `0` ZKP batch size hint as "use the default":

```rust
let zkp_batch_size = zkp_batch_size_hint
    .filter(|v| *v > 0)
    .unwrap_or(DEFAULT_ZKP_BATCH_SIZE as u16) as usize;
```

For address queues, `address_queue_zkp_batch_size.unwrap_or(DEFAULT_ADDRESS_ZKP_BATCH_SIZE as u16)` propagates `Some(0)` as `0`, and `fetch_address_queue_v2` then fails with a validation error ("Address queue ZKP batch size must be greater than zero"). That asymmetry could surprise callers using a shared client config.

Consider either:

- Treating 0 as "use default" for address queues as well, or
- Validating `address_queue_zkp_batch_size` at the API boundary and rejecting 0 consistently for all queues.

Separately, in `fetch_address_queue_v2` you'll reuse any cached hash chains whenever `cached.len() >= expected_batch_count`, including the case where `expected_batch_count == 0` but cached entries exist. If that situation is possible, you may want an explicit `expected_batch_count > 0` guard before preferring cache, to avoid returning unrelated cached chains when the current request doesn't actually form any full batches.

Also applies to: 640-917
937-998: Use `% 2 == 0` for consistency with other code in this file; verify encoding convention aligns with documented levels.

In `deduplicate_nodes_from_refs` you use:

```rust
let sibling_pos = if pos.is_multiple_of(2) { … };
…
let parent_hash = if pos.is_multiple_of(2) { … } else { … };
```

While `is_multiple_of` is available on `u64` in recent stable Rust, elsewhere in this file (e.g., the address-queue proof loop) you use the more idiomatic `pos % 2 == 0` pattern. For consistency, switch to:

```diff
 for (level, sibling_hash) in proof_ctx.proof.iter().enumerate() {
-    let sibling_pos = if pos.is_multiple_of(2) {
+    let sibling_pos = if pos % 2 == 0 {
         pos + 1
     } else {
         pos - 1
     };
     …
-    let parent_hash = if pos.is_multiple_of(2) {
+    let parent_hash = if pos % 2 == 0 {
         Poseidon::hashv(&[&current_hash.0, &sibling_hash.0])
     } else {
         Poseidon::hashv(&[&sibling_hash.0, &current_hash.0])
     };
```

Additionally, verify that the encoding convention is correct: the doc comment for `encode_node_index` states "Level `tree_height - 1` = root," but in this function you store parents at `(level + 1)`, which would place the root at level `tree_height`. Confirm this matches the intended behavior or update the documentation.
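To make the parity convention and level/position packing concrete, here is a small standalone sketch. It is a simplified model of the layout discussed in this review (level in the top 8 bits, position in the low 56 bits), not the crate's actual implementation:

```rust
/// Pack a tree level into the top 8 bits and a position into the low 56 bits.
/// Assumes `position` fits in 56 bits.
fn encode_node_index(level: u8, position: u64) -> u64 {
    debug_assert!(position < (1u64 << 56), "position must fit in 56 bits");
    ((level as u64) << 56) | position
}

/// Sibling of a node: even positions pair with pos + 1, odd positions with pos - 1.
/// This is the `pos % 2 == 0` pattern the review recommends.
fn sibling_pos(pos: u64) -> u64 {
    if pos % 2 == 0 {
        pos + 1
    } else {
        pos - 1
    }
}

fn main() {
    assert_eq!(encode_node_index(0, 5), 5);
    assert_eq!(encode_node_index(2, 3), (2u64 << 56) | 3);
    assert_eq!(sibling_pos(4), 5);
    assert_eq!(sibling_pos(7), 6);
    // Walking one level up halves the position; storing parents at (level + 1)
    // therefore places the root of a height-h tree at level h, as noted above.
    assert_eq!(6u64 / 2, 3);
}
```

Pinning these invariants in a unit test would make any mismatch between the doc comment ("root at `tree_height - 1`") and the `(level + 1)` storage convention fail loudly.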
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)

- `Cargo.lock` is excluded by `!**/*.lock`

📒 Files selected for processing (31)

- `.gitignore` (1 hunks)
- `Cargo.toml` (0 hunks)
- `src/api/api.rs` (2 hunks)
- `src/api/method/get_batch_address_update_info.rs` (0 hunks)
- `src/api/method/get_multiple_new_address_proofs.rs` (1 hunks)
- `src/api/method/get_queue_elements.rs` (2 hunks)
- `src/api/method/get_queue_info.rs` (1 hunks)
- `src/api/method/mod.rs` (1 hunks)
- `src/api/rpc_server.rs` (1 hunks)
- `src/ingester/fetchers/grpc.rs` (1 hunks)
- `src/ingester/parser/indexer_events.rs` (1 hunks)
- `src/ingester/parser/state_update.rs` (2 hunks)
- `src/ingester/parser/tx_event_parser_v2.rs` (3 hunks)
- `src/ingester/persist/indexed_merkle_tree/helpers.rs` (4 hunks)
- `src/ingester/persist/indexed_merkle_tree/mod.rs` (1 hunks)
- `src/ingester/persist/indexed_merkle_tree/proof.rs` (0 hunks)
- `src/ingester/persist/leaf_node.rs` (2 hunks)
- `src/ingester/persist/leaf_node_proof.rs` (2 hunks)
- `src/ingester/persist/mod.rs` (1 hunks)
- `src/ingester/persist/persisted_indexed_merkle_tree.rs` (3 hunks)
- `src/ingester/persist/persisted_state_tree.rs` (1 hunks)
- `src/ingester/persist/spend.rs` (2 hunks)
- `src/main.rs` (2 hunks)
- `src/monitor/mod.rs` (1 hunks)
- `src/monitor/queue_hash_cache.rs` (3 hunks)
- `src/monitor/queue_monitor.rs` (2 hunks)
- `src/openapi/mod.rs` (2 hunks)
- `src/openapi/specs/api.yaml` (2 hunks)
- `tests/integration_tests/batched_address_tree_tests.rs` (3 hunks)
- `tests/integration_tests/batched_state_tree_tests.rs` (5 hunks)
- `tests/integration_tests/utils.rs` (2 hunks)
💤 Files with no reviewable changes (3)
- src/ingester/persist/indexed_merkle_tree/proof.rs
- src/api/method/get_batch_address_update_info.rs
- Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (17)
- src/api/method/get_multiple_new_address_proofs.rs
- src/ingester/persist/persisted_state_tree.rs
- src/monitor/queue_hash_cache.rs
- src/ingester/fetchers/grpc.rs
- src/main.rs
- .gitignore
- src/ingester/persist/leaf_node.rs
- tests/integration_tests/batched_address_tree_tests.rs
- src/api/rpc_server.rs
- src/ingester/persist/mod.rs
- src/monitor/queue_monitor.rs
- src/ingester/persist/indexed_merkle_tree/mod.rs
- src/api/api.rs
- src/ingester/persist/leaf_node_proof.rs
- src/ingester/parser/indexer_events.rs
- src/api/method/mod.rs
- src/ingester/persist/spend.rs
🧰 Additional context used
🧬 Code graph analysis (4)
src/ingester/persist/persisted_indexed_merkle_tree.rs (1)
- src/ingester/persist/indexed_merkle_tree/helpers.rs (6)
  - `compute_hash_with_cache` (52-69)
  - `compute_range_node_hash_v1` (73-86)
  - `compute_range_node_hash_v2` (92-102)
  - `get_top_element` (132-145)
  - `get_zeroeth_exclusion_range` (104-115)
  - `get_zeroeth_exclusion_range_v1` (117-130)

src/openapi/mod.rs (2)
- src/api/method/get_queue_elements.rs (1)
  - `get_queue_elements` (161-260)
- src/api/api.rs (1)
  - `get_queue_elements` (269-274)

src/api/method/get_queue_info.rs (2)
- src/api/api.rs (1)
  - `get_queue_info` (276-281)
- src/api/method/utils.rs (1)
  - `bs58` (567-567)

tests/integration_tests/batched_state_tree_tests.rs (1)
- tests/integration_tests/utils.rs (1)
  - `setup` (230-239)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (15)
src/monitor/mod.rs (1)
1-1: Visibility change of `queue_hash_cache` looks appropriate.

Making `queue_hash_cache` public is consistent with the idea of reusing the queue cache outside `queue_monitor` and fits the PR summary about generalized queue handling. Just be aware this makes `monitor::queue_hash_cache` part of the crate's public surface (if `monitor` is re-exported); if you want to keep the API narrower long term, you could instead re-export only specific types/functions from a higher-level module later.

src/ingester/persist/persisted_indexed_merkle_tree.rs (3)
6-10: LGTM! The imports are correctly updated to include both v1 and v2 hash computation functions, matching the refactored helpers module.

39-60: LGTM! The branching logic correctly applies v1 hashing for `AddressV1` trees and v2 hashing for all other tree types. This is consistent with the helpers module where `AddressV2` uses the 2-field hash.

495-511: LGTM! The test correctly uses `compute_range_node_hash_v2` with `get_zeroeth_exclusion_range` (the V2 variant), maintaining consistency with the production code refactoring.

src/ingester/persist/indexed_merkle_tree/helpers.rs (4)
14-33: LGTM! The dispatch logic is clear with helpful documentation explaining the difference between V1 (3-field hash) and V2 (2-field hash). The commit reference in the comment provides good traceability for the design decision.

71-86: LGTM! Clear documentation explaining the 3-field Poseidon hash structure for AddressV1. The implementation correctly pads `next_index` to 32 bytes (big-endian) before hashing.

88-102: LGTM! Well-documented new function with clear explanation of the design rationale. The implementation correctly uses `Poseidon::<Fr>::new_circom(2)` for the 2-field hash (value, next_value).

104-145: LGTM! The helper functions are consistently updated to use `bigint_to_be_bytes_array` for constructing `next_value` and `value` fields. The `unwrap()` calls are safe since `HIGHEST_ADDRESS_PLUS_ONE` is a known constant guaranteed to fit in 32 bytes.

tests/integration_tests/utils.rs (1)
34-34: Import alias for `SolanaAccount` is clear and avoids type-name collisions.

Moving the `solana_account::Account` import up and aliasing it as `SolanaAccount` keeps usages unambiguous next to the existing `Account`/`AccountV2` types and removes duplication further down the file. Looks good.

src/ingester/parser/state_update.rs (1)

10-10: Import reordering for `NewAddress` is safe.

This just adjusts the `light_event::event` import ordering (and still matches the `impl From<NewAddress> for AddressQueueUpdate` below), so there's no behavioral change here.

src/ingester/parser/tx_event_parser_v2.rs (2)

14-14: Updated `event_from_light_transaction` import is consistent with usage.

The new `light_event::parse::event_from_light_transaction` import matches the call at line 40 and doesn't change behavior, assuming the function signature is unchanged in the dependency.

28-38: Refactor of Light pubkey conversion for programs/accounts is behavior-preserving.

The reshaped `light_program_ids` and `light_accounts` mappings are equivalent to the previous logic (iterating and converting each `Pubkey` through `to_light_pubkey`), just with clearer formatting. No correctness or allocation changes beyond style.

src/openapi/mod.rs (1)

23-25: New queue data schemas correctly wired into OpenAPI components.

The addition of `AddressQueueData`, `InputQueueData`, `OutputQueueData`, and `StateQueueData` to the OpenAPI components list matches the new Rust types in `get_queue_elements` and looks consistent.

Also applies to: 84-90

tests/integration_tests/batched_state_tree_tests.rs (1)

153-168: Updated integration test correctly targets new queue API shape.

The test's use of `GetQueueElementsRequest` (with explicit per-queue limits) and the new `state_queue.{output_queue,input_queue}` accessors looks consistent with the refactored `get_queue_elements` response. The `map_or(0, |v| v.leaves.len())` pattern also makes the length assertions robust when queues are empty.

Also applies to: 238-310, 311-419, 461-511

src/api/method/get_queue_elements.rs (1)

54-86: New request/response shapes and high-level flow look coherent.

The expanded `GetQueueElementsRequest` (per-queue start/limit/ZKP hints) and `GetQueueElementsResponse` (context + optional `state_queue`/`address_queue`) align with the refactored behavior and how the integration test consumes them. The transactional flow in `get_queue_elements` (fetching output/input queues, merging proofs, and optionally loading the address queue) looks logically sound.

Also applies to: 161-260
```rust
let queue_sizes = fetch_queue_sizes(db, tree_filter).await?;

let tree_pubkeys: Vec<Vec<u8>> = queue_sizes
    .keys()
    .map(|(tree, _)| tree.clone())
    .collect::<std::collections::HashSet<_>>()
    .into_iter()
    .collect();

let tree_metadata_list = tree_metadata::Entity::find()
    .filter(tree_metadata::Column::TreePubkey.is_in(tree_pubkeys))
    .all(db)
    .await
    .map_err(|e| PhotonApiError::UnexpectedError(format!("DB error: {}", e)))?;

let tree_to_queue: HashMap<Vec<u8>, Vec<u8>> = tree_metadata_list
    .into_iter()
    .map(|t| (t.tree_pubkey, t.queue_pubkey))
    .collect();

let queues: Vec<QueueInfo> = queue_sizes
    .into_iter()
    .map(|((tree_bytes, queue_type), size)| {
        let queue_bytes = tree_to_queue
            .get(&tree_bytes)
            .cloned()
            .unwrap_or_else(|| vec![0u8; 32]);

        QueueInfo {
            tree: bs58::encode(&tree_bytes).into_string(),
            queue: bs58::encode(&queue_bytes).into_string(),
            queue_type,
            queue_size: size,
        }
    })
    .collect();
```
Handle empty `queue_sizes` early and avoid silent fallback queue pubkeys.

Two things worth tightening up here:

1. If `queue_sizes` is empty, the follow-up `tree_metadata::Entity::find().filter(TreePubkey.is_in(tree_pubkeys))` is unnecessary and may rely on how the ORM encodes an empty `IN` list. A simple early return (`Ok(GetQueueInfoResponse { queues: vec![], slot })`) after `fetch_queue_sizes` would be clearer and avoid a redundant query.

2. When a tree pubkey is missing from `tree_to_queue`, you currently fall back to `vec![0u8; 32]` and return that as a base58 string. That silently turns a data inconsistency into an apparently valid queue pubkey (`111...`). Consider either:
   - Skipping such entries, or
   - Surfacing this as an error, or
   - At least logging a warning so missing metadata isn't hidden.

🤖 Prompt for AI Agents
In src/api/method/get_queue_info.rs around lines 119 to 154, after calling
fetch_queue_sizes handle two issues: if queue_sizes.is_empty() return early with
Ok(GetQueueInfoResponse { queues: vec![], slot }) to avoid the unnecessary DB IN
query; and when mapping queue_sizes to QueueInfo, do not silently substitute
vec![0u8;32] for missing tree pubkeys — instead log a warning (e.g.
warn!("missing tree metadata for pubkey {}", bs58::encode(&tree_bytes))) and
skip that entry (do not include it in the queues vector) or return an error if
you prefer to fail fast; implement the chosen behavior consistently and remove
the zeroed fallback.
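A minimal standalone sketch of the "log and skip" option described above. Types are simplified to plain byte vectors, the `QueueInfo` struct is abbreviated, and `eprintln!` stands in for the crate's logger:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct QueueInfo {
    tree: Vec<u8>,
    queue: Vec<u8>,
    queue_size: u64,
}

/// Build QueueInfo entries, skipping trees with missing metadata
/// instead of substituting an all-zero queue pubkey.
fn build_queue_infos(
    queue_sizes: Vec<(Vec<u8>, u64)>,
    tree_to_queue: &HashMap<Vec<u8>, Vec<u8>>,
) -> Vec<QueueInfo> {
    queue_sizes
        .into_iter()
        .filter_map(|(tree, size)| match tree_to_queue.get(&tree) {
            Some(queue) => Some(QueueInfo {
                tree,
                queue: queue.clone(),
                queue_size: size,
            }),
            None => {
                // Surface the inconsistency instead of hiding it behind a zeroed pubkey.
                eprintln!("missing tree metadata for pubkey {:?}", tree);
                None
            }
        })
        .collect()
}

fn main() {
    let mut map = HashMap::new();
    map.insert(vec![1u8], vec![9u8]);
    let infos = build_queue_infos(vec![(vec![1u8], 10), (vec![2u8], 5)], &map);
    assert_eq!(infos.len(), 1); // the unknown tree (vec![2]) was skipped
    assert_eq!(infos[0].queue, vec![9u8]);
}
```

Whether skipping or failing fast is preferable depends on whether a missing `tree_metadata` row is expected during normal ingestion lag or always indicates corruption.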
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/api/method/get_multiple_new_address_proofs.rs (1)
96-128: Consider reporting all queued addresses in the error message.

The current logic fast-fails on the first queued address found but doesn't report if multiple requested addresses are already in the queue. While this prevents the operation correctly, reporting all queued addresses would provide better feedback to the caller.

You could collect all queued addresses and include them in the error message:

```diff
 if !queue_results.is_empty() {
-    let queued_address: Vec<u8> =
-        queue_results[0].try_get("", "address").map_err(|e| {
-            PhotonApiError::UnexpectedError(format!("Failed to get address: {}", e))
-        })?;
-    let queued_address = SerializablePubkey::try_from(queued_address)?;
-    return Err(PhotonApiError::ValidationError(format!(
-        "Address {} already exists",
-        queued_address
-    )));
+    let queued_addresses: Result<Vec<SerializablePubkey>, _> = queue_results
+        .iter()
+        .map(|row| {
+            let addr_bytes: Vec<u8> = row.try_get("", "address")?;
+            SerializablePubkey::try_from(addr_bytes)
+        })
+        .collect();
+    let queued_addresses = queued_addresses.map_err(|e| {
+        PhotonApiError::UnexpectedError(format!("Failed to get address: {}", e))
+    })?;
+    return Err(PhotonApiError::ValidationError(format!(
+        "Address(es) already exist: {:?}",
+        queued_addresses
+    )));
 }
```

src/api/method/get_queue_elements.rs (2)
189-226: Consider parallelizing independent queue fetches.

Output and input queue fetches are independent and could run concurrently using `tokio::join!` to reduce latency. Wrap each fetch in an `async` block returning a `Result`, join them, and then apply `?` to the intermediate results:

```diff
-    let (output_queue, output_proof_data) = if let Some(limit) = request.output_queue_limit {
-        let zkp_hint = request.output_queue_zkp_batch_size;
-        match fetch_queue(
-            &tx,
-            &request.tree,
-            QueueType::OutputStateV2,
-            request.output_queue_start_index,
-            limit,
-            zkp_hint,
-        )
-        .await?
-        {
-            QueueData::Output(data, proof_data) => (Some(data), proof_data),
-            QueueData::Input(_, _) => unreachable!("OutputState should return Output"),
-        }
-    } else {
-        (None, None)
-    };
-
-    let (input_queue, input_proof_data) = if let Some(limit) = request.input_queue_limit {
-        let zkp_hint = request.input_queue_zkp_batch_size;
-        match fetch_queue(
-            &tx,
-            &request.tree,
-            QueueType::InputStateV2,
-            request.input_queue_start_index,
-            limit,
-            zkp_hint,
-        )
-        .await?
-        {
-            QueueData::Input(data, proof_data) => (Some(data), proof_data),
-            QueueData::Output(_, _) => unreachable!("InputState should return Input"),
-        }
-    } else {
-        (None, None)
-    };
+    let output_future = async {
+        if let Some(limit) = request.output_queue_limit {
+            let zkp_hint = request.output_queue_zkp_batch_size;
+            match fetch_queue(
+                &tx,
+                &request.tree,
+                QueueType::OutputStateV2,
+                request.output_queue_start_index,
+                limit,
+                zkp_hint,
+            )
+            .await?
+            {
+                QueueData::Output(data, proof_data) => Ok((Some(data), proof_data)),
+                QueueData::Input(_, _) => unreachable!("OutputState should return Output"),
+            }
+        } else {
+            Ok((None, None))
+        }
+    };
+
+    let input_future = async {
+        if let Some(limit) = request.input_queue_limit {
+            let zkp_hint = request.input_queue_zkp_batch_size;
+            match fetch_queue(
+                &tx,
+                &request.tree,
+                QueueType::InputStateV2,
+                request.input_queue_start_index,
+                limit,
+                zkp_hint,
+            )
+            .await?
+            {
+                QueueData::Input(data, proof_data) => Ok((Some(data), proof_data)),
+                QueueData::Output(_, _) => unreachable!("InputState should return Input"),
+            }
+        } else {
+            Ok((None, None))
+        }
+    };
+
+    let (output_result, input_result) = tokio::join!(output_future, input_future);
+    let (output_queue, output_proof_data) = output_result?;
+    let (input_queue, input_proof_data) = input_result?;
```

Note this requires the database connection/transaction to support concurrent use from both futures; if `tx` cannot be shared, the sequential version should stay.
779-828: Consider logging hash computation failures.

On lines 821-823, when `Poseidon::hashv` fails, the code silently breaks without logging or propagating the error. This could make debugging difficult if hash computation fails unexpectedly.

Apply this diff:

```diff
 match parent_hash {
     Ok(hash) => {
         current_hash = Hash::from(hash);
         let parent_pos = pos / 2;
         let parent_idx =
             encode_node_index((level + 1) as u8, parent_pos, tree_info.height as u8);
         nodes_map.insert(parent_idx, current_hash.clone());
     }
     Err(_) => {
+        log::warn!(
+            "Failed to compute parent hash at level {} for proof at index {}",
+            level,
+            proof.lowElementLeafIndex
+        );
         break;
     }
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- `src/api/api.rs` (3 hunks)
- `src/api/method/get_multiple_new_address_proofs.rs` (1 hunks)
- `src/api/method/get_queue_elements.rs` (2 hunks)
- `src/ingester/persist/leaf_node.rs` (2 hunks)
- `src/main.rs` (1 hunks)
- `src/openapi/mod.rs` (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/main.rs
🧰 Additional context used
🧬 Code graph analysis (4)
src/openapi/mod.rs (1)
- src/api/method/get_queue_elements.rs (1)
  - `get_queue_elements` (169-268)

src/ingester/persist/leaf_node.rs (1)
- tests/integration_tests/mock_tests.rs (1)
  - `map` (1507-1514)

src/api/api.rs (1)
- src/api/method/get_queue_info.rs (1)
  - `get_queue_info` (101-159)

src/api/method/get_queue_elements.rs (3)
- src/api/method/get_multiple_new_address_proofs.rs (2)
  - `get_multiple_new_address_proofs` (179-195)
  - `get_multiple_new_address_proofs_helper` (52-174)
- src/ingester/persist/persisted_state_tree.rs (1)
  - `get_subtrees` (132-209)
- src/monitor/queue_hash_cache.rs (2)
  - `get_cached_hash_chains` (69-102)
  - `store_hash_chains_batch` (14-66)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (17)
src/api/method/get_multiple_new_address_proofs.rs (2)
81-81: LGTM! Idiomatic Rust improvement. Using `.or_default()` is more idiomatic than `.or_insert_with(Vec::new)` and functionally equivalent for types implementing `Default`.

18-18: Previous review concern appears addressed. The `MAX_ADDRESSES` constant is now 1000, down from the 30_000 value flagged in the previous review. This resolves the concerns about excessive memory usage, large IN clauses, and query planner stress mentioned in earlier feedback.

src/ingester/persist/leaf_node.rs (2)
172-190: LGTM! The implementation safely addresses the past review concern. The new implementation correctly handles `ActiveValue` extraction using pattern matching instead of `.unwrap()`, eliminating the panic risk noted in the previous review. The approach is actually more defensive than the suggested fix, as it explicitly handles all `ActiveValue` variants.

The sequence extraction logic is sound:

- Pattern matching on `&m.seq` with `Set(opt)` safely extracts the inner `Option<i64>`
- `filter_map` automatically filters out `None` values, producing `Vec<i64>`
- Min/max computation handles the empty case gracefully (both would be `None`)

The debug logging provides useful observability for tracking persistence operations, including update counts and sequence ranges.

210-215: LGTM! Post-persistence logging enhances observability. The debug log after successful persistence complements the pre-persistence log, providing confirmation that the operation completed successfully and helping trace the full lifecycle of tree node persistence.
src/api/method/get_queue_elements.rs (11)
270-321: LGTM!The proof merging logic correctly combines output and input queue proofs, deduplicates nodes, and handles the case where no proofs are available.
323-408: LGTM!The validation logic is comprehensive:
- Enforces maximum queue element limits
- Properly filters by queue type
- Handles empty results gracefully
- Includes a pruning detection check to alert users when requested data has been pruned
471-482: Excellent fix for the previous unwrap() issue!The code now properly handles
Hash::newerrors without panicking, including helpful error messages with context (index and leaf_index). This addresses the concern raised in previous reviews.
426-443: Verify intentional return of empty data when batch size requirements aren't met.When
zkp_batch_size > 0and fewer thanzkp_batch_sizeelements are available, the function returns empty data (lines 432-437). Ensure this is the intended behavior and that callers are aware that partial batches are discarded.If this is expected, consider documenting this behavior in the function's doc comment.
492-517: LGTM! Good cache-with-fallback pattern.The code gracefully degrades to local computation when cache data is unavailable, with appropriate logging. This ensures the API remains functional even if the monitor hasn't populated the cache yet.
532-570: LGTM! Excellent error handling. The extraction of `tx_hashes` and `nullifiers` includes comprehensive error handling with detailed context (index, leaf_index) to aid debugging.
589-663: LGTM!The hash chain computation logic is well-structured with:
- Early validation for edge cases
- Clear separation of output vs. input queue logic
- Proper error handling with descriptive messages
- Debug logging for observability
665-729: LGTM! The validation, tree info retrieval, and initial data fetching are properly structured with comprehensive error handling.
843-942: LGTM! The address queue hash chain caching follows the same robust cache-with-fallback pattern used for state queues, with appropriate validation and error handling.
965-1006: LGTM! The node deduplication logic efficiently combines proof nodes with pre-fetched path nodes from the database, avoiding redundant hash computations.
1017-1072: LGTM! The path node fetching logic efficiently builds a deduplicated set of node indices and retrieves them in a single database query.
src/openapi/mod.rs (1)
23-26: LGTM! OpenAPI schema properly reflects API changes. The imports and component schema declarations correctly expose the new queue data structures (`AddressQueueData`, `InputQueueData`, `OutputQueueData`, `StateQueueData`) and `QueueInfo`, replacing the removed legacy types. Also applies to: 87-91
src/api/api.rs (1)
80-82: LGTM! Queue info endpoint properly integrated. The `get_queue_info` endpoint is correctly wired through the API layer with appropriate imports, method implementation, and OpenAPI registration, following the established patterns in this file. Also applies to: 276-281, 397-401
```rust
fn encode_node_index(level: u8, position: u64, tree_height: u8) -> u64 {
    debug_assert!(
        level <= tree_height,
        "level {} > tree_height {}",
        level,
        tree_height
    );
    ((level as u64) << 56) | position
}
```
Consider replacing debug_assert! with runtime validation.
The debug_assert! is stripped in release builds, so invalid inputs (level > tree_height) could silently corrupt node indices in production.
Apply this diff to add runtime validation:
```diff
 #[inline]
 fn encode_node_index(level: u8, position: u64, tree_height: u8) -> u64 {
-    debug_assert!(
+    assert!(
         level <= tree_height,
         "level {} > tree_height {}",
         level,
         tree_height
     );
     ((level as u64) << 56) | position
 }
```

Alternatively, return a `Result` and propagate the error:
```rust
fn encode_node_index(level: u8, position: u64, tree_height: u8) -> Result<u64, PhotonApiError> {
    if level > tree_height {
        return Err(PhotonApiError::UnexpectedError(
            format!("level {} > tree_height {}", level, tree_height)
        ));
    }
    Ok(((level as u64) << 56) | position)
}
```
}🤖 Prompt for AI Agents
In src/api/method/get_queue_elements.rs around lines 34 to 42, replace the
debug-only assertion with real runtime validation by changing the function to
return Result<u64, PhotonApiError>; check if level > tree_height and return
Err(PhotonApiError::UnexpectedError(format!("level {} > tree_height {}", level,
tree_height))) when invalid, otherwise return Ok(((level as u64) << 56) |
position), and update all callers to propagate the Result (use ? or map_err as
appropriate).
```rust
fn leaf_index_to_node_index(leaf_index: u32, tree_height: u32) -> i64 {
    2_i64.pow(tree_height - 1) + leaf_index as i64
}
```
Add overflow protection for tree height.
If `tree_height >= 64`, the expression `2_i64.pow(tree_height - 1)` overflows `i64` (and `tree_height == 0` underflows the `tree_height - 1` exponent), causing a panic in debug mode or wrap-around in release mode.
Apply this diff to add validation (`< 32` is a conservative practical bound, well below the `i64` limit):

```diff
 #[inline]
 fn leaf_index_to_node_index(leaf_index: u32, tree_height: u32) -> i64 {
+    assert!(
+        tree_height < 32,
+        "tree_height {} would overflow i64",
+        tree_height
+    );
     2_i64.pow(tree_height - 1) + leaf_index as i64
 }
```

Or return a `Result` to propagate the error gracefully.
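A hedged sketch of the checked-arithmetic option (illustrative names and bounds, not the PR's actual signatures):

```rust
// Hypothetical checked variant of the reviewed helper: returns None instead
// of panicking or wrapping on out-of-range tree heights.
fn leaf_index_to_node_index_checked(leaf_index: u32, tree_height: u32) -> Option<i64> {
    if tree_height == 0 {
        return None; // `tree_height - 1` would underflow
    }
    2_i64
        .checked_pow(tree_height - 1) // None on i64 overflow (tree_height >= 64)
        .and_then(|base| base.checked_add(leaf_index as i64))
}
```

A caller can then decide whether `None` maps to a validation error or an internal error.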
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/api/method/get_queue_elements.rs around lines 46-48, the function
computing node index uses 2_i64.pow(tree_height - 1) which can overflow for
large tree_height; change the function to validate tree_height (e.g., ensure
tree_height > 0 and below a safe limit such as 63 or the project-specific max —
or per reviewer suggestion, < 32 if that's the intended constraint) or make the
function return a Result and use checked arithmetic (checked_shl/checked_pow or
compute in i128 and verify it fits in i64) so you return an Err on overflow
instead of panicking/wrapping. Ensure the caller is updated to handle the Result
if you choose that approach.
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/api/method/get_queue_elements.rs (2)
34-42: Past review issue not addressed: Replace `debug_assert!` with runtime validation. The `debug_assert!` is stripped in release builds, so invalid inputs (level > tree_height) could silently corrupt node indices in production. Based on past review comments, this should use runtime validation. Consider returning a `Result` or using a regular `assert!`:

```diff
 #[inline]
 fn encode_node_index(level: u8, position: u64, tree_height: u8) -> u64 {
-    debug_assert!(
+    assert!(
         level <= tree_height,
         "level {} > tree_height {}",
         level,
         tree_height
     );
     ((level as u64) << 56) | position
 }
```
46-48: Past review issue not addressed: Add overflow protection for tree height. If `tree_height >= 64`, the expression `2_i64.pow(tree_height - 1)` overflows `i64` (and `tree_height == 0` underflows the exponent), causing a panic in debug mode or wrap-around in release mode. Based on past review comments, add validation (the `1..32` range is a conservative practical bound):

```diff
 #[inline]
 fn leaf_index_to_node_index(leaf_index: u32, tree_height: u32) -> i64 {
+    assert!(
+        tree_height > 0 && tree_height < 32,
+        "tree_height {} would overflow i64",
+        tree_height
+    );
     2_i64.pow(tree_height - 1) + leaf_index as i64
 }
```
🧹 Nitpick comments (1)
src/api/method/get_queue_elements.rs (1)
670-967: Consider refactoring `fetch_address_queue_v2` for maintainability. This function is ~300 lines long and handles multiple concerns: DB queries, proof generation, node deduplication, hash chain computation, and caching. This complexity makes it difficult to test and maintain.
Consider extracting smaller focused functions:
- `query_address_queue_from_db` — lines 694-728
- `generate_address_non_inclusion_proofs` — lines 764-833
- `compute_and_cache_address_hash_chains` — lines 851-950

This would improve testability and readability.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `Cargo.lock` is excluded by `!**/*.lock`

📒 Files selected for processing (4)
- `src/api/method/get_queue_elements.rs` (2 hunks)
- `src/openapi/mod.rs` (2 hunks)
- `tests/integration_tests/batched_address_tree_tests.rs` (3 hunks)
- `tests/integration_tests/batched_state_tree_tests.rs` (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
tests/integration_tests/batched_state_tree_tests.rs (1)
src/api/method/get_queue_elements.rs (1)
- `get_queue_elements` (178-274)

tests/integration_tests/batched_address_tree_tests.rs (1)
src/api/method/get_queue_elements.rs (1)
- `get_queue_elements` (178-274)
src/api/method/get_queue_elements.rs (6)
src/common/mod.rs (1)
- `format_bytes` (157-164)

src/ingester/persist/leaf_node_proof.rs (2)
- `get_multiple_compressed_leaf_proofs_by_indices` (14-90)
- `indices` (42-42)

src/ingester/persist/persisted_state_tree.rs (2)
- `get_subtrees` (132-209)
- `bytes` (571-574)

src/ingester/persist/leaf_node.rs (4)
- `leaf_index_to_node_index` (30-32)
- `from` (35-42)
- `from` (46-53)
- `from` (57-64)

src/ingester/parser/tree_info.rs (2)
- `height` (67-73)
- `get` (18-26)

src/monitor/queue_hash_cache.rs (1)
- `get_cached_hash_chains` (69-102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (9)
src/openapi/mod.rs (1)
23-26: LGTM! OpenAPI schema updates align with new per-queue API. The addition of new queue-related types (`InputQueueData`, `OutputQueueData`, `AddressQueueData`, `StateQueueData`, `Node`, `QueueRequest`, `QueueInfo`) to both imports and OpenAPI schema components is consistent with the broader API refactoring to support per-queue requests.
Also applies to: 87-93
tests/integration_tests/batched_address_tree_tests.rs (2)
166-177: LGTM! Test correctly migrated to new per-queue API. The test now properly constructs `GetQueueElementsRequest` with per-queue fields (output_queue, input_queue, address_queue) instead of the previous single-queue approach. The use of `QueueRequest` for per-queue parameters is consistent with the new API design. Also applies to: 223-234
179-202: LGTM! Proper handling of new address queue structure. The test correctly accesses the address queue data from the response and uses the new structure (`address_queue_before.addresses` with `element.0.to_bytes()`) for iterating and comparing addresses.

src/api/method/get_queue_elements.rs (3)
476-487: LGTM! Proper error handling for `Hash::new`. The previous `unwrap()` issue has been properly addressed. The code now uses `map_err` with descriptive error messages and collects into `Result<Vec<Hash>, PhotonApiError>`, which is then propagated with `?`.
178-274: LGTM! Well-structured per-queue request handling. The main function properly:
- Validates that at least one queue is requested (lines 186-190)
- Fetches output and input queues conditionally with proper error handling
- Merges proof data for state queues
- Handles address queue independently
- Returns a well-structured response
328-592: Verify the zkp_batch_size truncation logic aligns with client expectations. Lines 434-448 truncate the results to full batches when `zkp_batch_size` is provided. If the returned data is truncated (e.g., 15 elements requested with batch_size=10 returns only 10), clients may not realize they received fewer elements than requested.
Consider:
- Documenting this behavior clearly in the API specification
- Adding a field to the response indicating truncation occurred
- Or returning a validation error if the limit is not a multiple of zkp_batch_size
Does the client expect truncation or should this be an error? Please clarify the intended behavior.
tests/integration_tests/batched_state_tree_tests.rs (3)
240-263: LGTM! Test correctly uses new per-queue API structure. The test properly constructs `GetQueueElementsRequest` with per-queue fields (output_queue, input_queue, address_queue) and sets unused queues to `None`. The use of `QueueRequest` with explicit `limit`, `start_index`, and `zkp_batch_size` parameters is consistent with the new API design. Also applies to: 277-300
306-326: LGTM! Safe navigation of nested queue structures. The test correctly accesses queue data using safe Option chaining (`.as_ref()`, `.and_then()`, `.map_or()`), properly handling the nested structure where queues are now wrapped in `state_queue`. The default value of 0 for missing queues is appropriate for length calculations. Also applies to: 337-356, 359-379
394-410: LGTM! Proper handling of output queue leaf indices. The test correctly extracts the output queue from `state_queue`, accesses `leaf_indices` and `leaves`, and iterates through them for Merkle tree updates. The slice length check (min(10)) prevents index out of bounds.
```rust
low_element_proofs.push(proof.proof.clone());

let leaf_idx =
    encode_node_index(0, proof.lowElementLeafIndex as u64, tree_info.height as u8);
```
Potential data loss from u8 cast of tree height.
Lines 799, 809, and 823 (and similarly at lines 982, 985, 993, 1002) cast tree_info.height to u8. If the tree height exceeds 255, this will silently truncate the value, leading to incorrect node index encoding.
Apply validation before calling encode_node_index, or change the function signature to accept u32 and validate internally:
```diff
+    if tree_info.height > u8::MAX as u32 {
+        return Err(PhotonApiError::ValidationError(format!(
+            "Tree height {} exceeds maximum supported height of {}",
+            tree_info.height, u8::MAX
+        )));
+    }
+    let tree_height_u8 = tree_info.height as u8;
     let leaf_idx =
-        encode_node_index(0, proof.lowElementLeafIndex as u64, tree_info.height as u8);
+        encode_node_index(0, proof.lowElementLeafIndex as u64, tree_height_u8);
```
+ encode_node_index(0, proof.lowElementLeafIndex as u64, tree_height_u8);Or refactor encode_node_index to accept u32 and validate internally.
Also applies to: 809-809, 823-823
🤖 Prompt for AI Agents
In src/api/method/get_queue_elements.rs around lines 799, 809, 823 (and
similarly 982, 985, 993, 1002), tree_info.height is being cast to u8 before
calling encode_node_index which can silently truncate values >255; either
validate tree_info.height fits in u8 before the cast and return/error if it
doesn’t, or preferably change encode_node_index to accept a wider integer (u32)
and perform an internal bounds check there (returning an error if the height is
out of allowed range) and then call it with the uncasted value; ensure all call
sites are updated to the new signature or include the pre-checks to prevent
silent truncation.
```rust
    Poseidon::hashv(&[&current_hash.0, &sibling_hash.0])
} else {
    Poseidon::hashv(&[&sibling_hash.0, &current_hash.0])
};

match parent_hash {
    Ok(hash) => {
        current_hash = Hash::from(hash);
        let parent_pos = pos / 2;
        let parent_idx =
            encode_node_index((level + 1) as u8, parent_pos, tree_info.height as u8);
        nodes_map.insert(parent_idx, current_hash.clone());
    }
    Err(_) => {
        break;
    }
}
```
Silent error suppression in hash computation loop.
Line 827 uses break when Poseidon::hashv returns an error, silently stopping the proof path computation without propagating the error. This could lead to incomplete node maps and incorrect proofs.
Consider propagating the error instead:
```diff
 match parent_hash {
     Ok(hash) => {
         current_hash = Hash::from(hash);
         let parent_pos = pos / 2;
         let parent_idx =
             encode_node_index((level + 1) as u8, parent_pos, tree_info.height as u8);
         nodes_map.insert(parent_idx, current_hash.clone());
     }
-    Err(_) => {
-        break;
+    Err(e) => {
+        return Err(PhotonApiError::UnexpectedError(format!(
+            "Failed to compute parent hash at level {}: {}", level, e
+        )));
     }
 }
```
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
    Poseidon::hashv(&[&current_hash.0, &sibling_hash.0])
} else {
    Poseidon::hashv(&[&sibling_hash.0, &current_hash.0])
};

match parent_hash {
    Ok(hash) => {
        current_hash = Hash::from(hash);
        let parent_pos = pos / 2;
        let parent_idx =
            encode_node_index((level + 1) as u8, parent_pos, tree_info.height as u8);
        nodes_map.insert(parent_idx, current_hash.clone());
    }
    Err(e) => {
        return Err(PhotonApiError::UnexpectedError(format!(
            "Failed to compute parent hash at level {}: {}", level, e
        )));
    }
}
```
🤖 Prompt for AI Agents
In src/api/method/get_queue_elements.rs around lines 813 to 829, the code
silently stops the hash loop by using `break` when `Poseidon::hashv` returns an
error, which can produce incomplete node maps and bad proofs; replace the
`break` with proper error propagation (e.g., return Err(...) or use the `?`
operator) so the calling function receives the failure, and if needed add a
descriptive context message when converting the Poseidon error into your
function's error type; ensure the function signature returns a compatible Result
and update callers if necessary.
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/api/method/get_queue_elements.rs (3)
48-52: Protect `leaf_index_to_node_index` from overflow/underflow and keep in sync with other implementation.
`leaf_index_to_node_index` uses `2_i64.pow(tree_height - 1)` with no bounds checks. For large `tree_height` this will overflow `i64`, and for `tree_height == 0` it will underflow the exponent (`tree_height - 1`), causing panics/wrap-around. The same logic exists in `src/ingester/persist/leaf_node.rs`, so divergence here could be confusing.
Consider adding an explicit range check to keep this safe and make the invariant obvious (and mirror it in the other file):

```diff
 #[inline]
-fn leaf_index_to_node_index(leaf_index: u32, tree_height: u32) -> i64 {
-    2_i64.pow(tree_height - 1) + leaf_index as i64
-}
+fn leaf_index_to_node_index(leaf_index: u32, tree_height: u32) -> i64 {
+    // Support realistic Merkle heights while avoiding i64 overflow and underflow.
+    assert!(
+        (1..32).contains(&tree_height),
+        "tree_height {} out of supported range [1, 31]",
+        tree_height
+    );
+    2_i64.pow(tree_height - 1) + leaf_index as i64
+}
```

You may also want to update `src/ingester/persist/leaf_node.rs` to use the same invariant or centralize this helper.
468-476: Avoid silent truncation when casting `tree_info.height` to `u8`.
`tree_info.height` is cast to `u8` in several places (e.g., `tree_info.height as u8` for `StateQueueProofData.tree_height` and `encode_node_index` calls). If a tree height ever exceeds 255, this will silently truncate, corrupting node indices and proofs. This was called out previously and is still present.
A small helper that validates the range once and reuses the checked `u8` eliminates the risk:

```diff
@@
 const MAX_QUEUE_ELEMENTS_SQLITE: u16 = 15;
+#[inline]
+fn tree_height_to_u8(height: u32) -> Result<u8, PhotonApiError> {
+    if height > u8::MAX as u32 {
+        return Err(PhotonApiError::ValidationError(format!(
+            "Tree height {} exceeds maximum supported height of {}",
+            height,
+            u8::MAX
+        )));
+    }
+    Ok(height as u8)
+}
@@
-    let tree_height_u32 = tree_info.height as u32 + 1;
+    let raw_tree_height = tree_info.height as u32;
+    let tree_height_u32 = raw_tree_height + 1;
@@
     let path_nodes =
         fetch_path_nodes_from_db(tx, &serializable_tree, &indices, tree_height_u32).await?;
@@
     let proof_data = Some(StateQueueProofData {
         proofs: generated_proofs.clone(),
-        tree_height: tree_info.height as u8,
+        tree_height: tree_height_to_u8(raw_tree_height)?,
         path_nodes,
     });
@@
     let tree_info = TreeInfo::get(tx, &serializable_tree.to_string())
         .await?
         .ok_or_else(|| PhotonApiError::UnexpectedError("Failed to get tree info".to_string()))?;
@@
-    let leaf_idx =
-        encode_node_index(0, proof.lowElementLeafIndex as u64, tree_info.height as u8);
+    let tree_height_u8 = tree_height_to_u8(tree_info.height as u32)?;
+    let leaf_idx = encode_node_index(0, proof.lowElementLeafIndex as u64, tree_height_u8);
@@
-    let sibling_idx = encode_node_index(level as u8, sibling_pos, tree_info.height as u8);
+    let sibling_idx = encode_node_index(level as u8, sibling_pos, tree_height_u8);
@@
-    let parent_idx =
-        encode_node_index((level + 1) as u8, parent_pos, tree_info.height as u8);
+    let parent_idx =
+        encode_node_index((level + 1) as u8, parent_pos, tree_height_u8);
```

You might instead compute `tree_height_u8` once right after reading `tree_info` in `fetch_address_queue_v2` and reuse it throughout that function.
Also applies to: 810-836
818-844: Propagate `Poseidon::hashv` failures instead of silently breaking out of the loop.
On error from `Poseidon::hashv`, the current code just `break`s the loop, which produces an incomplete `nodes_map` and silently degrades proof correctness without any error to the caller.
It's safer to surface this as an API error:

```diff
 for (level, sibling_hash) in proof.proof.iter().enumerate() {
@@
-    let parent_hash = if pos % 2 == 0 {
-        Poseidon::hashv(&[&current_hash.0, &sibling_hash.0])
-    } else {
-        Poseidon::hashv(&[&sibling_hash.0, &current_hash.0])
-    };
-
-    match parent_hash {
-        Ok(hash) => {
-            current_hash = Hash::from(hash);
-            let parent_pos = pos / 2;
-            let parent_idx =
-                encode_node_index((level + 1) as u8, parent_pos, tree_info.height as u8);
-            nodes_map.insert(parent_idx, current_hash.clone());
-        }
-        Err(_) => {
-            break;
-        }
-    }
+    let parent_hash = if pos % 2 == 0 {
+        Poseidon::hashv(&[&current_hash.0, &sibling_hash.0])
+    } else {
+        Poseidon::hashv(&[&sibling_hash.0, &current_hash.0])
+    }
+    .map_err(|e| {
+        PhotonApiError::UnexpectedError(format!(
+            "Failed to compute parent hash at level {} for lowElementLeafIndex {}: {}",
+            level, proof.lowElementLeafIndex, e
+        ))
+    })?;
+
+    current_hash = Hash::from(parent_hash);
+    let parent_pos = pos / 2;
+    let parent_idx =
+        encode_node_index((level + 1) as u8, parent_pos, tree_height_u8);
+    nodes_map.insert(parent_idx, current_hash.clone());
@@
     pos /= 2;
```

(Assuming you introduce `tree_height_u8` as suggested in the previous comment and compute it once before this loop.) This way any hashing failure produces a clear `PhotonApiError` instead of an undetected partial proof.
🧹 Nitpick comments (5)
src/snapshot/mod.rs (1)
261-265: Unnecessary `clone()` on line 262. The `path` variable is not used after this assignment, so `.clone()` is unnecessary when the prefix is empty.

```diff
 let full_path = if self.gcs_prefix.is_empty() {
-    path.clone()
+    path
 } else {
     format!("{}/{}", self.gcs_prefix, path)
 };
```

src/monitor/queue_hash_cache.rs (1)
88-98: Consider logging when hash chain length is invalid. The current implementation silently skips entries where `result.hash_chain.len() != 32`. This could mask data corruption issues. A debug/warn log would help with troubleshooting.

```diff
 for result in results {
     if result.hash_chain.len() == 32 {
         let mut hash_array = [0u8; 32];
         hash_array.copy_from_slice(&result.hash_chain);
         chains.push(CachedHashChain {
             zkp_batch_index: result.zkp_batch_index,
             hash_chain: hash_array,
         });
+    } else {
+        debug!(
+            "Skipping cached hash chain with invalid length: {} (expected 32)",
+            result.hash_chain.len()
+        );
     }
 }
```

src/monitor/queue_monitor.rs (1)
260-261: Consider using `HashSet` for `used_cached_indices`. The `used_cached_indices` vector is checked with `contains()` at line 313, which is O(n). For typical batch sizes this is acceptable, but a `HashSet` would provide O(1) lookup.

src/api/method/get_queue_elements.rs (2)
332-452: Consider SQLite parameter limits when combining `MAX_QUEUE_ELEMENTS` with `fetch_path_nodes_from_db`.
`fetch_queue` allows up to `MAX_QUEUE_ELEMENTS` (30,000) for all backends. For each leaf index, `fetch_path_nodes_from_db` builds the full path of node indices and feeds them into `state_trees::Column::NodeIdx.is_in(all_path_indices)`. On SQLite this can easily exceed the 999-parameter limit even for modest tree heights, causing runtime SQL errors. Given you already special-case SQLite for `MAX_QUEUE_ELEMENTS_SQLITE` in the address queue, you may want to either:
- Use a lower per-request limit for state queues on SQLite, or
- Chunk `all_path_indices` into multiple smaller `IN` queries and merge the results.

No functional change is required immediately, but this is a potential source of backend-specific failures under larger loads.
280-330: Optionally validate consistency between output and input state proofs in `merge_state_queue_proofs`.
`merge_state_queue_proofs` trusts that all proofs share the same tree height, root, and root sequence. If, due to misconfiguration or data corruption, output and input queues reference different roots/heights, you could merge incompatible paths without noticing. If you want stronger guarantees, consider:
- Checking that any second `tree_height` matches the first and returning an error if not.
- Verifying that all proofs agree on `root` and `root_seq` (or at least asserting this invariant in debug).

This would make it easier to detect cross-tree mixups early.
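A minimal sketch of that invariant check (`ProofMeta` is a hypothetical stand-in for the relevant fields of the real proof type):

```rust
// Reject merging proofs that disagree on root or tree height.
struct ProofMeta {
    root: [u8; 32],
    tree_height: u8,
}

fn check_proofs_consistent(proofs: &[ProofMeta]) -> Result<(), String> {
    let first = match proofs.first() {
        Some(p) => p,
        None => return Ok(()), // nothing to merge
    };
    for (i, p) in proofs.iter().enumerate().skip(1) {
        if p.tree_height != first.tree_height {
            return Err(format!("proof {} tree_height mismatch", i));
        }
        if p.root != first.root {
            return Err(format!("proof {} root mismatch", i));
        }
    }
    Ok(())
}
```

Running this once before the merge turns a silent cross-tree mixup into an explicit error.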
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `Cargo.lock` is excluded by `!**/*.lock`

📒 Files selected for processing (7)
- `Cargo.toml` (2 hunks)
- `src/api/method/get_queue_elements.rs` (2 hunks)
- `src/monitor/queue_hash_cache.rs` (4 hunks)
- `src/monitor/queue_monitor.rs` (6 hunks)
- `src/snapshot/gcs_utils/mod.rs` (1 hunks)
- `src/snapshot/gcs_utils/resumable_upload.rs` (1 hunks)
- `src/snapshot/mod.rs` (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)
- `Cargo.toml`
🧰 Additional context used
🧬 Code graph analysis (4)
src/snapshot/gcs_utils/mod.rs (1)
src/snapshot/gcs_utils/resumable_upload.rs (1)
- `resumable_upload` (17-38)

src/monitor/queue_monitor.rs (1)
src/monitor/queue_hash_cache.rs (2)
- `delete_hash_chains` (104-130)
- `store_hash_chains_batch` (14-66)

src/snapshot/mod.rs (1)
src/snapshot/gcs_utils/resumable_upload.rs (2)
- `resumable_upload` (17-38)
- `get_access_token` (321-335)

src/api/method/get_queue_elements.rs (6)
src/common/mod.rs (1)
- `format_bytes` (157-164)

src/ingester/persist/leaf_node_proof.rs (2)
- `get_multiple_compressed_leaf_proofs_by_indices` (14-90)
- `indices` (42-42)

src/ingester/persist/persisted_state_tree.rs (2)
- `get_subtrees` (132-209)
- `bytes` (571-574)

src/ingester/persist/leaf_node.rs (4)
- `leaf_index_to_node_index` (30-32)
- `from` (35-42)
- `from` (46-53)
- `from` (57-64)

src/ingester/parser/tree_info.rs (2)
- `height` (67-73)
- `get` (18-26)

src/monitor/queue_hash_cache.rs (2)
- `get_cached_hash_chains` (69-102)
- `store_hash_chains_batch` (14-66)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (21)
src/snapshot/gcs_utils/mod.rs (1)
1-1: LGTM! Simple module declaration correctly exposes the `resumable_upload` submodule.

src/snapshot/mod.rs (2)
32-32: LGTM! The new `gcs_utils` module is correctly declared and exposed.
267-279: LGTM! The resumable upload integration is well-structured with proper error context for both token acquisition and upload operations.
src/snapshot/gcs_utils/resumable_upload.rs (8)
1-14: LGTM! Imports are appropriate, and constants are well-documented. The 8MB chunk size aligns with GCS recommendations for resumable uploads.
17-38: LGTM! Clean orchestration of the resumable upload flow with appropriate logging.
41-109: LGTM! Solid retry logic with exponential backoff for server errors and rate limits. The Location header extraction is correctly implemented.
160-275: LGTM with minor observation. The chunked upload logic with retry and resume-from-offset is well-implemented. The `continue 'outer` pattern correctly handles resume scenarios by recalculating chunks from the new position.
280-318: LGTM! The upload status query correctly parses the `Range` header to determine resume position, with appropriate fallback behavior when the header is missing or the upload is already complete.
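For illustration, deriving the resume offset from a `Range: bytes=0-LAST` status response can be sketched as (hypothetical helper, not the PR's code):

```rust
// GCS status responses report persisted bytes as "bytes=0-LAST";
// the next byte to upload is LAST + 1. Returns None if the header
// is absent or malformed, which callers treat as "start from 0".
fn next_offset_from_range(range_header: &str) -> Option<u64> {
    range_header
        .strip_prefix("bytes=0-")
        .and_then(|last| last.parse::<u64>().ok())
        .map(|last| last + 1)
}
```

A `None` here maps naturally onto the fallback behavior the review describes.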
320-335: LGTM! Good fallback pattern: tries GCE metadata service first (for cloud environments), then falls back to service account credentials file for local development.
337-355: LGTM! Correct implementation with appropriate timeout for non-GCP environments and proper `Metadata-Flavor` header.
357-424: LGTM! Well-implemented JWT-based service account authentication flow. The `unwrap()` on `duration_since(UNIX_EPOCH)` (line 385) is safe since system time is always after the Unix epoch.

src/monitor/queue_hash_cache.rs (2)
14-23: Good use of generics for connection abstraction. Making the function generic over `ConnectionTrait` improves flexibility and testability, allowing callers to pass transactions or mock connections.
104-130: LGTM! The deletion function correctly filters by all columns of the composite key and returns the affected row count for observability. The early return for empty indices is efficient.
src/monitor/queue_monitor.rs (4)
299-310: Verify `zkp_batch_index` value in divergence struct. The `zkp_batch_index` field is set to `zkp_batch_idx` (the loop-relative index starting from 0), while `actual_zkp_idx` is used elsewhere for cache operations. When logged via `log_divergence`, this will show the relative offset rather than the actual batch index. If the intent is to show the actual ZKP batch index for debugging, consider using `actual_zkp_idx`:

```diff
 divergences.push(HashChainDivergence {
     queue_info: QueueHashChainInfo {
         queue_type,
         tree_pubkey,
         batch_index: pending_batch_index,
         current_index: start_offset + (zkp_batch_idx as u64 * zkp_batch_size),
     },
     expected_hash_chain: *computed,
     actual_hash_chain: *on_chain,
-    zkp_batch_index: zkp_batch_idx,
+    zkp_batch_index: actual_zkp_idx,
 });
```
327-346: Good cache invalidation strategy. The two-phase validation approach ensures only verified chains are cached, and invalid cached entries are cleaned up. Logging errors without propagating is appropriate since cache operations shouldn't block the main verification flow.
404-413: LGTM - appropriate log level adjustment. Incomplete batches are expected during normal operation, so debug level is appropriate to reduce noise while maintaining observability when needed.
237-245: Good documentation of the index offset discrepancy. The comment clearly explains the 1-based vs 0-based indexing difference between on-chain batch.start_index and address_queues.queue_index. Using `saturating_sub(1)` is defensive against underflow. The referenced explanation in address.rs lines 51-55 confirms this pattern: address queues are 0-based while batch updates use 1-based indices because address trees have a pre-initialized zeroth element.
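The defensive conversion described above can be illustrated as (hypothetical function name, not the PR's API):

```rust
// Convert the 1-based on-chain start_index to the 0-based queue_index.
// saturating_sub avoids an underflow panic if an unexpected 0 ever appears.
fn to_queue_index(on_chain_start_index: u64) -> u64 {
    on_chain_start_index.saturating_sub(1)
}
```

With a plain `- 1`, an input of 0 would panic in debug builds; `saturating_sub` clamps it to 0 instead.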
479-492: Nice improvement on hash decoding and error messages. The new `Hash::new(...).map_err(...)` logic for `account_hashes`, `tx_hashes`, and `nullifiers` removes the previous `unwrap()` panics and adds precise error context (index and leaf_index). This is a solid robustness improvement for corrupted DB rows. No changes requested. Also applies to: 541-579
598-672: `compute_state_queue_hash_chains` batch semantics and validation look sound. The function cleanly bails out on empty input or incomplete batches, validates hash/nullifier length, and differentiates queue types correctly. Together with the earlier truncation logic in `fetch_queue`, this keeps hash-chain computation aligned to full ZK batches. No further changes needed here.
674-979: Overall address-queue flow is solid; main concerns are already covered above. The address queue logic (limit checks, raw SQL construction via `format_bytes`, proof count validation, node deduplication, and hash-chain caching/persistence) is coherent and defensive. Apart from the tree-height casting and `Poseidon::hashv` error handling already called out, the rest of the function looks good. No additional changes requested.
182-278: Top-level `get_queue_elements` orchestration looks correct and composable. Input validation (at least one queue required), transaction scoping, per-queue fetching, and the merged `StateQueueData`/`AddressQueueData` response structure are all consistent and easy to extend. Error propagation via `?` is clean. No issues from this layer beyond the lower-level concerns already noted.
// First, we need to collect all data to know total size
// For very large files, we could use unknown size (*) but that's more complex
let mut all_data = Vec::new();
while let Some(chunk_result) = byte_stream.next().await {
    let chunk = chunk_result?;
    all_data.extend_from_slice(&chunk);
}

let total_size = all_data.len() as u64;
info!(
    "Total upload size: {} bytes ({:.2} MB)",
    total_size,
    total_size as f64 / 1024.0 / 1024.0
);
Collecting entire stream into memory defeats streaming benefits for large files.
The resumable upload is designed for large files, but collecting all data into memory first (lines 125-128) could cause OOM for multi-GB uploads. Consider implementing true streaming with unknown total size (* in Content-Range) or chunked streaming with size tracking.
For true streaming without knowing the total size upfront, GCS supports:
Content-Range: bytes OFFSET-CHUNK_END/*
for intermediate chunks, and:
Content-Range: bytes OFFSET-LAST/TOTAL
for the final chunk, where TOTAL is the now-known size (the response confirms completion).
Alternatively, if the total size can be determined without buffering (e.g., from file metadata), pass it as a parameter.
🤖 Prompt for AI Agents
In src/snapshot/gcs_utils/resumable_upload.rs around lines 123-136, the current
code buffers the entire byte stream into a Vec (lines 125-128) which can OOM for
multi-GB uploads; instead implement true streaming: do not collect all_data —
read chunks from byte_stream and upload them incrementally using Resumable
Upload semantics (send intermediate PUTs with Content-Range: bytes
OFFSET-CHUNK_END/* and for the final chunk send Content-Range with the final
total or confirm completion based on the GCS response), or if total size is
available from metadata pass it and use full Content-Range; ensure you track the
current offset, mark the final chunk appropriately, handle partial-resume
responses and retries, and never hold the whole payload in memory.
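As a sketch of the Content-Range bookkeeping this prompt describes — intermediate chunks advertise an unknown total with `*`, and the final chunk pins the total once the stream ends — the header values could be computed like this (helper name and shape are illustrative, not taken from the PR):

```rust
// Format the Content-Range value for a resumable-upload chunk.
// `total` is None for intermediate chunks (size still unknown) and
// Some(size) for the final chunk, once the stream has been drained.
fn content_range(offset: u64, chunk_len: u64, total: Option<u64>) -> String {
    let end = offset + chunk_len - 1; // Content-Range end is inclusive
    match total {
        Some(t) => format!("bytes {}-{}/{}", offset, end, t), // final chunk
        None => format!("bytes {}-{}/*", offset, end),        // more to come
    }
}

fn main() {
    // First 8 MiB chunk of a stream whose total size is not yet known.
    assert_eq!(content_range(0, 8 * 1024 * 1024, None), "bytes 0-8388607/*");
    // Final 100-byte chunk: the total is now known and must be sent.
    assert_eq!(
        content_range(8_388_608, 100, Some(8_388_708)),
        "bytes 8388608-8388707/8388708"
    );
}
```

The uploader would track `offset` across chunks and pass `Some(total)` only when the stream is exhausted and one last buffered chunk remains to send.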
if total_size == 0 {
    // Handle empty file case
    let response = client
        .put(upload_uri)
        .header(CONTENT_LENGTH, "0")
        .header(CONTENT_RANGE, "bytes */*")
        .send()
        .await
        .context("Failed to upload empty file")?;

    if !response.status().is_success() {
        let status = response.status();
        let body = response.text().await.unwrap_or_default();
        return Err(anyhow!(
            "Failed to upload empty file: {} - {}",
            status,
            body
        ));
    }
    return Ok(());
}
Content-Range header for empty files should specify size.
For empty files where the total size is known to be 0, the Content-Range header should be bytes */0 rather than bytes */*. The */* format indicates unknown total size.
let response = client
.put(upload_uri)
.header(CONTENT_LENGTH, "0")
- .header(CONTENT_RANGE, "bytes */*")
+ .header(CONTENT_RANGE, "bytes */0")
.send()
.await
    .context("Failed to upload empty file")?;

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if total_size == 0 {
    // Handle empty file case
    let response = client
        .put(upload_uri)
        .header(CONTENT_LENGTH, "0")
        .header(CONTENT_RANGE, "bytes */0")
        .send()
        .await
        .context("Failed to upload empty file")?;

    if !response.status().is_success() {
        let status = response.status();
        let body = response.text().await.unwrap_or_default();
        return Err(anyhow!(
            "Failed to upload empty file: {} - {}",
            status,
            body
        ));
    }
    return Ok(());
}
🤖 Prompt for AI Agents
In src/snapshot/gcs_utils/resumable_upload.rs around lines 138 to 158, the code
handling uploads for empty files sets the Content-Range header to "bytes */*"
which denotes unknown total size; change it to use the known total size "bytes
*/0" (or format as bytes */{total_size}) so the header explicitly specifies the
zero length, i.e. construct the Content-Range value from total_size (0) and send
"bytes */0" instead of "bytes */*".
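To make the distinction concrete, a minimal helper (hypothetical, not code from the PR) that picks between the two forms: a zero-byte request can carry `bytes */{total}` when the size is known, and `bytes */*` only when it is not.

```rust
// Content-Range values used when no body bytes accompany the request:
// "bytes */{total}" when the total size is known, "bytes */*" when it is not.
fn status_content_range(total: Option<u64>) -> String {
    match total {
        Some(t) => format!("bytes */{}", t),
        None => "bytes */*".to_string(),
    }
}

fn main() {
    // Empty file with known size 0 -> explicit total, as the review suggests.
    assert_eq!(status_content_range(Some(0)), "bytes */0");
    // Unknown total (e.g. querying resume status mid-stream).
    assert_eq!(status_content_range(None), "bytes */*");
}
```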
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/snapshot/gcs_utils/resumable_upload.rs (2)
123-136: Collecting entire stream into memory defeats streaming benefits for large files. This issue was previously flagged and remains unaddressed. The resumable upload is designed for large files, but collecting all data into memory first (lines 125-128) could cause OOM for multi-GB uploads.

138-158: Content-Range header for empty files should specify size. This issue was previously flagged and remains unaddressed. For empty files where the total size is known to be 0, the Content-Range header should be bytes */0 rather than bytes */*.
🧹 Nitpick comments (3)
src/snapshot/gcs_utils/resumable_upload.rs (3)
177-183: Avoid unnecessary allocation. The .to_vec() call creates an unnecessary copy of the chunk data. Since reqwest::Body can be created from a slice, you can avoid this allocation. Apply this diff:

 let response = client
     .put(upload_uri)
     .header(CONTENT_LENGTH, chunk_data.len().to_string())
     .header(CONTENT_RANGE, &content_range)
     .header(CONTENT_TYPE, "application/octet-stream")
-    .body(chunk_data.to_vec())
+    .body(Bytes::copy_from_slice(chunk_data))
     .send()
     .await;

266-274: Progress logging might skip the final chunk. The condition offset % (100 * 1024 * 1024) < CHUNK_SIZE as u64 might not log progress for the final chunk if it doesn't align with the 100MB boundary. Consider adding a final progress log after the loop completes. Apply this diff after line 275:

     }
 }

+    // Log final progress
+    if total_size >= 100 * 1024 * 1024 {
+        info!(
+            "Upload progress: 100.0% ({:.2} MB / {:.2} MB)",
+            total_size as f64 / 1024.0 / 1024.0,
+            total_size as f64 / 1024.0 / 1024.0
+        );
+    }
+
     Ok(())
 }
359-426: LGTM! Correct OAuth2 service account flow implementation. The JWT creation and token exchange follow GCP's service account authentication flow correctly. The RS256 algorithm and claim structure are appropriate.

Optional: Consider making the OAuth scope configurable.

The scope is currently hardcoded to devstorage.read_write (line 392). If this module might be used for operations requiring different scopes in the future, consider accepting it as a parameter or constant.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/snapshot/gcs_utils/resumable_upload.rs (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (6)
src/snapshot/gcs_utils/resumable_upload.rs (6)
1-13: LGTM! Well-chosen constants for GCS resumable uploads. The 8MB chunk size aligns with GCS recommendations (multiples of 256KB), and the retry configuration is appropriate.

15-38: LGTM! Clean orchestration of the resumable upload workflow. The function properly delegates to specialized helpers and provides clear logging at each stage.

40-109: LGTM! Robust retry logic with exponential backoff. The function correctly handles transient failures (5xx, 429) and extracts the upload URI from the Location header.

280-318: LGTM! Correct implementation of upload status query. The function properly queries GCS for the current upload position and correctly parses the Range header to determine the resume offset.

320-337: LGTM! Appropriate fallback strategy for authentication. The function correctly prioritizes service account credentials over the metadata service, providing flexibility for different deployment environments.

339-357: LGTM! Standard GCE metadata service token retrieval. The implementation correctly queries the metadata service with appropriate headers and timeout.
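A quick sanity check of the 256KB-multiple rule mentioned for chunk sizes, with the chunk size assumed to mirror the PR's 8MB constant:

```rust
// GCS requires resumable-upload chunk sizes (other than the last chunk) to be
// multiples of 256 KiB; an 8 MiB chunk satisfies this.
const KIB_256: u64 = 256 * 1024;
const CHUNK_SIZE: u64 = 8 * 1024 * 1024; // assumed to match the PR's constant

fn is_valid_chunk_size(size: u64) -> bool {
    size > 0 && size % KIB_256 == 0
}

fn main() {
    assert!(is_valid_chunk_size(CHUNK_SIZE)); // 8 MiB = 32 * 256 KiB
    assert!(!is_valid_chunk_size(1_000_000)); // not a 256 KiB multiple
}
```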
Actionable comments posted: 1
♻️ Duplicate comments (5)
src/api/method/get_queue_elements.rs (5)
35-43: [Duplicate] Debug assertion insufficient for production. This debug_assert was previously flagged and remains unaddressed. In release builds, an invalid level > tree_height will silently corrupt node indices.

47-49: [Duplicate] Overflow protection still missing. Previously flagged: if tree_height >= 32, the expression 2_i64.pow(tree_height - 1) will overflow, causing a panic or wrap-around.

807-842: [Duplicate] Silent error suppression remains. This was previously flagged: the break at line 836 silently stops proof path computation when Poseidon::hashv fails, potentially creating incomplete node maps and incorrect proofs. The error should be propagated instead.

808-808: [Duplicate] Tree height truncation risk persists. Previously flagged: casting tree_info.height to u8 at lines 808, 818, and 832 can silently truncate values exceeding 255, leading to incorrect node index encoding.

Also applies to: 818-818, 832-832

994-994: [Duplicate] Tree height u8 casts continue. Previously flagged: lines 994, 1002, and 1011 cast tree_height (u8) when calling encode_node_index. While tree_height is already u8 in this function's parameters, the upstream issue remains if callers pass truncated values.

Also applies to: 1002-1002, 1011-1011
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/api/method/get_queue_elements.rs (2 hunks)
🔇 Additional comments (10)
src/api/method/get_queue_elements.rs (10)
179-275: Well-structured orchestration. The function clearly orchestrates per-queue fetching, merging, and transaction handling. The validation ensuring at least one queue is requested is appropriate.

277-327: LGTM. The proof merging logic correctly collects and deduplicates nodes from both output and input queues.

329-593: Well-implemented queue fetching with proper error handling. The function correctly handles different queue types, validates limits, and properly propagates errors. The Hash::new error handling at lines 477-488 now correctly collects Results instead of using unwrap(), addressing the previous review concern.

595-669: LGTM. The hash chain computation logic correctly handles both OutputStateV2 and InputStateV2 queue types with proper error handling and fallback behavior.

671-806: Address queue logic is well-structured. The function properly validates limits based on database backend, fetches queue elements, and retrieves non-inclusion proofs. The raw SQL queries use format_bytes for safe parameter handling.

843-976: Complex but correct hash chain caching. The caching strategy with fallback computation is appropriate. The logic correctly handles batch boundaries and stores computed chains for future requests.

981-1024: Node deduplication logic is sound. The function correctly walks proof paths, collects sibling and parent hashes from both proofs and the database, deduplicates by index, and returns a sorted result.

1026-1031: LGTM. The indexed leaf hash computation is straightforward and properly propagates errors.

1069-1090: Database query and result mapping look correct. The SeaORM query properly filters by tree and node indices, and the result mapping handles hash conversion errors appropriately.

1-30: Well-organized imports and constants. The module structure is clear with appropriate external dependencies. The differentiated limits for SQLite vs. other databases (lines 28-29) reflect platform-specific constraints.
let all_path_indices: Vec<i64> = leaf_indices
    .iter()
    .flat_map(|&leaf_idx| {
        let node_idx = leaf_index_to_node_index(leaf_idx as u32, tree_height);
        let mut path = vec![node_idx];
        let mut current = node_idx;
        while current > 1 {
            current >>= 1;
            path.push(current);
        }
        path
    })
    .sorted()
    .dedup()
    .collect();
Critical: u64 to u32 truncation in leaf index.
Line 1052 casts leaf_idx (u64) to u32 when calling leaf_index_to_node_index. This silently truncates leaf indices exceeding u32::MAX (~4.3 billion), corrupting the node index calculation and path fetching.
Apply this diff to validate the cast:
let all_path_indices: Vec<i64> = leaf_indices
.iter()
.flat_map(|&leaf_idx| {
+ if leaf_idx > u32::MAX as u64 {
+ log::error!("Leaf index {} exceeds u32::MAX", leaf_idx);
+ return vec![];
+ }
let node_idx = leaf_index_to_node_index(leaf_idx as u32, tree_height);Or better, refactor leaf_index_to_node_index to accept u64 and validate internally, then update all call sites.
🤖 Prompt for AI Agents
In src/api/method/get_queue_elements.rs around lines 1049–1063 the code unsafely
casts leaf_idx (u64) to u32 before calling leaf_index_to_node_index which
truncates values > u32::MAX; change to avoid the cast by refactoring
leaf_index_to_node_index to accept u64 (or otherwise operate on u64) and perform
internal validation (or explicit try_into with an error return) so oversized
indices are handled rather than silently truncated, then update all call sites
to pass the u64 and handle the validation error where appropriate.
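A u64-based sketch of the suggested refactor — heap-style indexing with root = 1, children of N at 2N and 2N+1, and the leaf row starting at 2^(height-1) — might look like this (illustrative only, not the PR's code; checked arithmetic stands in for the proposed validation):

```rust
// Map a leaf index to its heap-style node index, staying in u64 throughout
// to avoid the u32 truncation flagged above. Returns None on overflow or
// when tree_height is 0 / too large for a u64 shift.
fn leaf_index_to_node_index(leaf_index: u64, tree_height: u32) -> Option<u64> {
    let row_start = 1u64.checked_shl(tree_height.checked_sub(1)?)?;
    row_start.checked_add(leaf_index)
}

// Walk from a node up to the root, halving the index at each level.
fn path_to_root(mut node: u64) -> Vec<u64> {
    let mut path = vec![node];
    while node > 1 {
        node >>= 1;
        path.push(node);
    }
    path
}

fn main() {
    // Height-3 tree: leaves occupy node indices 4..=7.
    assert_eq!(leaf_index_to_node_index(0, 3), Some(4));
    assert_eq!(leaf_index_to_node_index(3, 3), Some(7));
    // Path from leaf node 6 to the root: 6 -> 3 -> 1.
    assert_eq!(path_to_root(6), vec![6, 3, 1]);
    // Invalid height is rejected instead of wrapping.
    assert_eq!(leaf_index_to_node_index(0, 0), None);
}
```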
Actionable comments posted: 3
♻️ Duplicate comments (4)
src/api/method/get_queue_elements.rs (4)
31-49: Harden node index helpers against invalid heights and overflow. encode_node_index relies on debug_assert!, which is stripped in release builds, and leaf_index_to_node_index uses 2_i64.pow(tree_height - 1) with no bounds check. If tree_height is 0 or large (≥64), this can underflow/overflow and either panic (debug) or silently wrap in release. Because these helpers feed DB indices and proof construction, this is worth tightening.

Consider at minimum enforcing a safe range at runtime:

 #[inline]
 fn encode_node_index(level: u8, position: u64, tree_height: u8) -> u64 {
-    debug_assert!(
-        level <= tree_height,
-        "level {} > tree_height {}",
-        level,
-        tree_height
-    );
+    assert!(
+        level <= tree_height,
+        "level {} > tree_height {}",
+        level,
+        tree_height
+    );
     ((level as u64) << 56) | position
 }

 /// Convert leaf_index to node_index in binary tree format (root=1, children of N are 2N and 2N+1)
 #[inline]
 fn leaf_index_to_node_index(leaf_index: u32, tree_height: u32) -> i64 {
-    2_i64.pow(tree_height - 1) + leaf_index as i64
+    assert!(
+        tree_height > 0 && tree_height <= 63,
+        "tree_height {} out of supported range (1..=63)",
+        tree_height
+    );
+    2_i64
+        .pow(tree_height - 1)
+        .checked_add(leaf_index as i64)
+        .expect("leaf_index_to_node_index overflow")
 }

Or, if you prefer not to panic, make leaf_index_to_node_index return a Result<i64, PhotonApiError> and propagate errors from callers.
479-596: Good handling of proofs and hash-chain caching; watch the tree height cast to u8. The proof handling and hash-chain cache fallback are well done:

- You validate generated_proofs.len() == indices.len().
- StateQueueProofData holds both proofs and DB path nodes, enabling deduplicate_nodes_from_refs.
- Hash-chain caching uses expected_batch_count and falls back to local computation, truncating cached chains to the expected count.

One thing to re-check: tree_height is stored as tree_info.height as u8 in StateQueueProofData and later passed into encode_node_index. If tree_info.height can ever exceed u8::MAX, this will silently truncate and corrupt indices. You may want to either:

- Enforce a max height at ingest / TreeInfo creation, or
- Validate here (returning a PhotonApiError::ValidationError if tree_info.height > u8::MAX as u32).
674-990: Address queue v2: fix zkp_batch_size == 0 and hash-chain cache alignment. This block is generally well-structured (per-backend max_allowed, raw SQL with format_bytes, subtrees prefetch, and reuse of the hash-chain cache), but there are a few correctness edge cases:

1. **zkp_batch_size == 0 semantics are inconsistent**

- In get_queue_elements, zkp_batch_size for the address queue is taken as-is from the request or DEFAULT_ADDRESS_ZKP_BATCH_SIZE if None.
- Unlike state queues, a value of 0 is not normalized and only errors in one code path (if zkp_batch_size == 0 inside the recompute branch).
- If there is any cached data, or if addresses.is_empty(), a zkp_batch_size of 0 will not error, but will still influence expected_batch_count (which becomes 0), leading to odd behavior.

It would be clearer to either reject zkp_batch_size == 0 up front or normalize it to the default, e.g.:

     let zkp_batch_size = req
-        .zkp_batch_size
-        .unwrap_or(DEFAULT_ADDRESS_ZKP_BATCH_SIZE as u16);
+        .zkp_batch_size
+        .filter(|v| *v > 0)
+        .unwrap_or(DEFAULT_ADDRESS_ZKP_BATCH_SIZE as u16);

and keep the `zkp_batch_size == 0` error path as a hard guard inside `fetch_address_queue_v2`.

2. **Cached hash-chains may not align with the current request**

In the cached branch:

```rust
if !cached.is_empty() && cached.len() >= expected_batch_count {
    let mut sorted = cached;
    sorted.sort_by_key(|c| c.zkp_batch_index);
    for entry in sorted {
        leaves_hash_chains.push(Hash::from(entry.hash_chain));
    }
}
```

- expected_batch_count can be 0 (e.g., addresses.len() < zkp_batch_size), yet the condition still passes and you return all cached chains.
- Even when expected_batch_count > 0, you return every cached chain rather than just the first expected_batch_count, whereas the state queue path uses .take(expected_batch_count).

This can cause leaves_hash_chains.len() to exceed the number of full batches implied by addresses.len() for this response. Making this symmetric with the state queue logic would avoid over-returning:

-    if !cached.is_empty() && cached.len() >= expected_batch_count {
+    if expected_batch_count > 0 && !cached.is_empty() && cached.len() >= expected_batch_count {
         log::debug!( ... );
         let mut sorted = cached;
         sorted.sort_by_key(|c| c.zkp_batch_index);
-        for entry in sorted {
-            leaves_hash_chains.push(Hash::from(entry.hash_chain));
-        }
+        for entry in sorted.into_iter().take(expected_batch_count) {
+            leaves_hash_chains.push(Hash::from(entry.hash_chain));
+        }
     } else if !addresses.is_empty() {

3. **Merkle index helpers and Poseidon errors**

- As in the state path, tree_info.height is cast to u8 in calls to encode_node_index (e.g., lines 821-822, 832, 846). If heights can exceed 255, this silently truncates. Same remediation as noted for tree_height above: validate or bound heights before casting.
- In the Poseidon loop, Err(_) => break; silently suppresses parent-hash failures and yields a partial nodes_map. Propagating an error instead would prevent subtle proof corruption:

-    match parent_hash {
-        Ok(hash) => { ... }
-        Err(_) => { break; }
-    }
+    match parent_hash {
+        Ok(hash) => { ... }
+        Err(e) => {
+            return Err(PhotonApiError::UnexpectedError(format!(
+                "Failed to compute parent hash at level {}: {}",
+                level, e
+            )));
+        }
+    }
1040-1104: Fix potential u64 → u32 truncation when building path indices. In fetch_path_nodes_from_db, the closure:

.flat_map(|&leaf_idx| {
    let node_idx = leaf_index_to_node_index(leaf_idx as u32, tree_height);
    ...
})

blindly casts leaf_idx: u64 to u32. If any leaf index exceeds u32::MAX, this truncates and you'll compute and query the wrong path indices. Prior review already flagged this as critical; it's still present.

A safer pattern is to validate and avoid the flat_map/iterator trick, so you can return a PhotonApiError on overflow:

-    let all_path_indices: Vec<i64> = leaf_indices
-        .iter()
-        .flat_map(|&leaf_idx| {
-            let node_idx = leaf_index_to_node_index(leaf_idx as u32, tree_height);
-            let mut path = vec![node_idx];
-            let mut current = node_idx;
-            while current > 1 {
-                current >>= 1;
-                path.push(current);
-            }
-            path
-        })
-        .sorted()
-        .dedup()
-        .collect();
+    let mut all_path_indices: Vec<i64> = Vec::new();
+    for &leaf_idx in leaf_indices {
+        let leaf_u32: u32 = leaf_idx.try_into().map_err(|_| {
+            PhotonApiError::ValidationError(format!(
+                "Leaf index {} exceeds u32::MAX and cannot be converted",
+                leaf_idx
+            ))
+        })?;
+        let mut node_idx = leaf_index_to_node_index(leaf_u32, tree_height);
+        all_path_indices.push(node_idx);
+        while node_idx > 1 {
+            node_idx >>= 1;
+            all_path_indices.push(node_idx);
+        }
+    }
+    all_path_indices.sort();
+    all_path_indices.dedup();

This removes the silent truncation and preserves the existing semantics.
🧹 Nitpick comments (5)
src/ingester/persist/indexed_merkle_tree/proof.rs (1)
259-359: Document duplicate address handling and explain BATCH_SIZE.Two observations:
Duplicate addresses: If the input
valuescontains duplicates, the HashMap will only retain one entry per unique address (the last one processed). This behavior should be documented in the function's doc comment.BATCH_SIZE constant: The same unexplained magic number of 100 appears here. Consider documenting why this value was chosen or making it a module-level constant with a comment explaining the choice.
Add documentation:
/// Optimized version for API use: Query the next smallest element for each input address. /// Returns a HashMap mapping INPUT ADDRESS -> range node model. /// This is O(1) lookup per address instead of O(n) scan in the caller. +/// Note: If the input contains duplicate addresses, only one entry per unique address will be returned. pub async fn query_next_smallest_elements_by_address<T>(Consider extracting the constant:
/// Batch size for chunked queries to avoid query plan explosion while maintaining efficiency const QUERY_BATCH_SIZE: usize = 100;src/api/method/get_queue_elements.rs (4)
63-99: Request/response schemas look good; clarify Default semantics. The new QueueRequest, GetQueueElementsRequest, and GetQueueElementsResponse shapes look coherent and future-proof (per-queue options, deny_unknown_fields, camelCase).

One nuance: GetQueueElementsRequest derives Default, which yields an all-zero tree and no queues, and will always hit the "At least one queue must be requested" validation. If Default is only for serde this is harmless, but if it might be used programmatically, consider either:

- Dropping Default for GetQueueElementsRequest, or
- Providing a manual Default that is clearly invalid only in tests (e.g., tree: Hash::default() but documented as such).
101-171: Data structures are well-factored; consider documenting invariants. Node, StateQueueData, OutputQueueData, InputQueueData, and AddressQueueData neatly separate shared nodes from per-queue payloads, and the serde configuration matches the API shape.

Given how tightly leaf_indices, account_hashes, leaves, and leaves_hash_chains must align by position for verification, it would help future maintainers/clients if you document the index invariants (e.g., "all vectors here are aligned by index") in struct-level doc comments.
332-477: fetch_queue logic is solid; nice defensive hash handling. The per-queue query construction and validation look good:

- Limits are enforced (limit > MAX_QUEUE_ELEMENTS).
- Input queue filters (NullifierQueueIndex.is_not_null, NullifiedInTree = false, Spent = true) are clear.
- Start index + pruning check (first_queue_index > start_index) is explicit.

Also, the rework around Hash::new for account_hashes is much safer than the previous unwrap, with precise error messages including idx and leaf_index.

No issues here; just one minor thought: zkp_batch_size is always > 0 given the .filter(|v| *v > 0).unwrap_or(DEFAULT_ZKP_BATCH_SIZE) logic, so the if zkp_batch_size > 0 guard is redundant and could be removed for clarity, but that's purely cosmetic.
598-672: compute_state_queue_hash_chains behavior is correct; small cleanups possible. The batch computation logic is consistent with how you truncate queues:

- Early-returns on empty input or when queue_elements.len() / zkp_batch_size == 0.
- Enforces 32-byte hash/nullifier lengths with clear error messages.
- Uses create_hash_chain_from_slice and wraps any failure into PhotonApiError.

Two minor nits:

- You re-import (use) create_hash_chain_from_slice inside the function even though it's already imported at the top of the file.
- Since this helper assumes zkp_batch_size > 0 and queue_elements.len() is a multiple of that (as enforced by fetch_queue), adding an internal debug_assert!(zkp_batch_size > 0) would make those assumptions explicit.
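The full-batch truncation discussed above can be illustrated with a toy hash chain. The fold below is a stand-in for the Poseidon-based create_hash_chain_from_slice, so only the batching behavior (not the hash values) is meaningful:

```rust
// Only complete batches of zkp_batch_size produce a hash chain; any trailing
// remainder is dropped, mirroring the truncation enforced by fetch_queue.
fn hash_chains(elements: &[u64], zkp_batch_size: usize) -> Vec<u64> {
    assert!(zkp_batch_size > 0, "zkp_batch_size must be positive");
    let full_batches = elements.len() / zkp_batch_size;
    elements[..full_batches * zkp_batch_size]
        .chunks(zkp_batch_size)
        // Toy chain: fold each batch with a cheap mix; NOT a real Poseidon chain.
        .map(|batch| {
            batch
                .iter()
                .fold(0u64, |acc, &x| acc.wrapping_mul(31).wrapping_add(x))
        })
        .collect()
}

fn main() {
    // 7 elements with batch size 3 -> two full batches, one element dropped.
    assert_eq!(hash_chains(&[1, 2, 3, 4, 5, 6, 7], 3).len(), 2);
    // Fewer elements than one batch -> no chains at all.
    assert_eq!(hash_chains(&[1, 2], 3).len(), 0);
}
```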
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/api/method/get_queue_elements.rs (2 hunks)

src/ingester/persist/indexed_merkle_tree/proof.rs (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/ingester/persist/indexed_merkle_tree/proof.rs (2)
src/common/mod.rs (1)
format_bytes (157-164)

tests/integration_tests/utils.rs (1)

format (269-270)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (4)
src/ingester/persist/indexed_merkle_tree/proof.rs (2)
182-184: LGTM! Good optimization. Early return for empty input avoids unnecessary database operations.

374-390: LGTM! Efficient optimization. The refactor correctly uses the new query_next_smallest_elements_by_address function to achieve O(1) address lookup instead of scanning through results. The comments clearly explain the performance benefit.

src/api/method/get_queue_elements.rs (2)

182-278: get_queue_elements orchestration is clear; transaction handling looks sound. The top-level orchestration reads clean: you validate that at least one queue is requested, open a transaction, fetch per-queue data, merge state proofs, then commit. Error propagation (especially around DB and hash-chain cache access) is consistent, and skip_serializing_if on optional queues keeps the payloads lean.

No functional concerns here from my side.

1040-1045: compute_indexed_leaf_hash is clean and well-wrapped. The indexed-leaf hashing helper is nicely encapsulated:

- Uses Poseidon::hashv directly on the inner [u8; 32] data.
- Wraps any Poseidon error into PhotonApiError::UnexpectedError with context.
- Returns a Hash newtype, keeping the rest of the code clean.

No changes needed here.
fn merge_state_queue_proofs(
    output_proof_data: &Option<StateQueueProofData>,
    input_proof_data: &Option<StateQueueProofData>,
) -> Result<(Vec<Node>, Hash, u64), PhotonApiError> {
    let mut all_proofs: Vec<&crate::ingester::persist::MerkleProofWithContext> = Vec::new();
    let mut all_path_nodes: HashMap<i64, Hash> = HashMap::new();
    let mut tree_height: Option<u8> = None;
    let mut initial_root: Option<Hash> = None;
    let mut root_seq: Option<u64> = None;

    // Collect proofs and path nodes from output queue
    if let Some(ref proof_data) = output_proof_data {
        tree_height = Some(proof_data.tree_height);
        all_path_nodes.extend(proof_data.path_nodes.clone());
        for proof in &proof_data.proofs {
            if initial_root.is_none() {
                initial_root = Some(proof.root.clone());
                root_seq = Some(proof.root_seq);
            }
            all_proofs.push(proof);
        }
    }

    // Collect proofs and path nodes from input queue
    if let Some(ref proof_data) = input_proof_data {
        if tree_height.is_none() {
            tree_height = Some(proof_data.tree_height);
        }
        all_path_nodes.extend(proof_data.path_nodes.clone());
        for proof in &proof_data.proofs {
            if initial_root.is_none() {
                initial_root = Some(proof.root.clone());
                root_seq = Some(proof.root_seq);
            }
            all_proofs.push(proof);
        }
    }

    if all_proofs.is_empty() || tree_height.is_none() {
        return Ok((Vec::new(), Hash::default(), 0));
    }

    let height = tree_height.unwrap();
    let nodes = deduplicate_nodes_from_refs(&all_proofs, height, &all_path_nodes);

    Ok((
        nodes,
        initial_root.unwrap_or_default(),
        root_seq.unwrap_or_default(),
    ))
}
🧩 Analysis chain
🏁 Script executed:
# First, let's check the file size and read the function in context
wc -l src/api/method/get_queue_elements.rs

Repository: Lightprotocol/photon
Length of output: 105
🏁 Script executed:
# Read the relevant section of the file
cat -n src/api/method/get_queue_elements.rs | sed -n '270,340p'

Repository: Lightprotocol/photon
Length of output: 2793
🏁 Script executed:
# Search for StateQueueProofData definition
rg -A 10 'struct StateQueueProofData|pub struct StateQueueProofData' --type rs

Repository: Lightprotocol/photon
Length of output: 90
🏁 Script executed:
# Find callers of merge_state_queue_proofs
rg 'merge_state_queue_proofs' --type rs -B 2 -A 2

Repository: Lightprotocol/photon
Length of output: 90
🏁 Script executed:
# Check the PhotonApiError type definition
rg -A 20 'enum PhotonApiError|pub enum PhotonApiError' --type rs

Repository: Lightprotocol/photon
Length of output: 90
🏁 Script executed:
# Search for StateQueueProofData definition without type filter
rg 'struct StateQueueProofData|pub struct StateQueueProofData' -A 10

Repository: Lightprotocol/photon
Length of output: 824
🏁 Script executed:
# Find callers of merge_state_queue_proofs
rg 'merge_state_queue_proofs'

Repository: Lightprotocol/photon
Length of output: 244
🏁 Script executed:
# Check PhotonApiError definition
rg 'enum PhotonApiError|pub enum PhotonApiError' -A 30

Repository: Lightprotocol/photon
Length of output: 1656
🏁 Script executed:
# Search for how this function is used in context
rg -B 5 -A 5 'merge_state_queue_proofs\('

Repository: Lightprotocol/photon
Length of output: 1598
🏁 Script executed:
# Search for MerkleProofWithContext structure and its root field
rg 'struct MerkleProofWithContext|pub struct MerkleProofWithContext' -A 15

Repository: Lightprotocol/photon
Length of output: 1320
🏁 Script executed:
# Search for deduplicate_nodes_from_refs to see its assumptions
rg 'fn deduplicate_nodes_from_refs|pub fn deduplicate_nodes_from_refs' -A 20

Repository: Lightprotocol/photon
Length of output: 1742
🏁 Script executed:
# Look for any documentation or comments about tree_height invariants
rg 'tree_height|root_seq' -B 2 -A 2 | head -100

Repository: Lightprotocol/photon
Length of output: 6489
🏁 Script executed:
# Check if there are any comments or documentation about output/input queue expectations
rg 'output.*input|input.*output' -B 3 -A 3 src/api/method/get_queue_elements.rs | head -80

Repository: Lightprotocol/photon
Length of output: 1545
🏁 Script executed:
# Look at how output_proof_data and input_proof_data are created to understand their source
rg 'output_proof_data|input_proof_data' -B 5 src/api/method/get_queue_elements.rs | head -100

Repository: Lightprotocol/photon
Length of output: 1257
Add explicit consistency checks for tree_height, root, and root_seq in merge_state_queue_proofs
The function's state queue data documentation explicitly states that initial_root is "shared by output and input queues," yet the current implementation silently uses first-seen values when merging output and input proof data. If these independently-fetched queues represent different state versions (mismatched tree_height, root, or root_seq), the result would be an incoherent StateQueueData.
Since deduplicate_nodes_from_refs uses a single tree_height for all proofs, mismatched heights could produce incorrect node calculations. Add explicit validation to fail fast:
```diff
     if let Some(ref proof_data) = input_proof_data {
-        if tree_height.is_none() {
-            tree_height = Some(proof_data.tree_height);
-        }
+        if let Some(h) = tree_height {
+            if h != proof_data.tree_height {
+                return Err(PhotonApiError::UnexpectedError(format!(
+                    "Mismatched tree heights between output and input proofs: {} vs {}",
+                    h, proof_data.tree_height
+                )));
+            }
+        } else {
+            tree_height = Some(proof_data.tree_height);
+        }
         ...
         for proof in &proof_data.proofs {
-            if initial_root.is_none() {
-                initial_root = Some(proof.root.clone());
-                root_seq = Some(proof.root_seq);
-            }
+            if let Some(ref root) = initial_root {
+                if *root != proof.root {
+                    return Err(PhotonApiError::UnexpectedError(
+                        "Mismatched initial_root across state queue proofs".to_string(),
+                    ));
+                }
+            } else {
+                initial_root = Some(proof.root.clone());
+                root_seq = Some(proof.root_seq);
+            }
             all_proofs.push(proof);
         }
     }
```

🤖 Prompt for AI Agents
In src/api/method/get_queue_elements.rs around lines 280 to 330, the
merge_state_queue_proofs function currently picks the first-seen tree_height,
initial_root and root_seq from output/input without validating they match; add
explicit consistency checks so that when both output_proof_data and
input_proof_data are Some you compare their tree_height values and their
roots/root_seq (or compare each incoming proof's root/root_seq against the
chosen initial_root) and return an Err(PhotonApiError::... /* choose an
appropriate error variant with a clear message */) if any mismatch is found;
ensure the function only proceeds to call deduplicate_nodes_from_refs when a
single consistent tree_height is established and both queues agree on
initial_root and root_seq, otherwise bail fast with the error.
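The fail-fast pattern this suggestion describes — accept the first-seen value, then reject any later value that disagrees — can be sketched in isolation. The helper below is a hypothetical, simplified stand-in (plain `Option<u8>` heights and `String` errors rather than the crate's `StateQueueProofData` and `PhotonApiError`):

```rust
// Hypothetical sketch: merge per-queue tree heights, rejecting mismatches
// instead of silently keeping the first-seen value.
fn merge_heights(queues: &[Option<u8>]) -> Result<Option<u8>, String> {
    let mut merged: Option<u8> = None;
    for q in queues.iter().flatten() {
        match merged {
            // A later queue disagrees with the first-seen height: bail fast.
            Some(h) if h != *q => {
                return Err(format!("mismatched tree heights: {} vs {}", h, q))
            }
            // Same height seen again: nothing to do.
            Some(_) => {}
            // First queue that actually carries a height.
            None => merged = Some(*q),
        }
    }
    Ok(merged)
}

fn main() {
    assert_eq!(merge_heights(&[Some(26), Some(26)]), Ok(Some(26)));
    assert_eq!(merge_heights(&[None, Some(40)]), Ok(Some(40)));
    assert!(merge_heights(&[Some(26), Some(40)]).is_err());
    println!("ok");
}
```

The same shape applies to `initial_root`/`root_seq`: compare each incoming value against the chosen one and return an error on disagreement, so downstream consumers never see an incoherent merge.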
```rust
/// Deduplicate nodes across all merkle proofs using pre-fetched path nodes from DB.
/// Returns a Vec<Node> sorted by index.
/// Uses path_nodes (DB node_idx -> hash) for parent hashes instead of computing them.
fn deduplicate_nodes_from_refs(
    proofs: &[&crate::ingester::persist::MerkleProofWithContext],
    tree_height: u8,
    path_nodes: &HashMap<i64, Hash>,
) -> Vec<Node> {
    let mut nodes_map: HashMap<u64, Hash> = HashMap::new();
    let tree_height_u32 = tree_height as u32 + 1;

    for proof_ctx in proofs {
        let mut pos = proof_ctx.leaf_index as u64;
        let mut db_node_idx = leaf_index_to_node_index(proof_ctx.leaf_index, tree_height_u32);

        // Store the leaf itself
        let leaf_idx = encode_node_index(0, pos, tree_height);
        nodes_map.insert(leaf_idx, proof_ctx.hash.clone());

        // Walk up the proof path, storing sibling hashes and path node hashes from DB
        for (level, sibling_hash) in proof_ctx.proof.iter().enumerate() {
            let sibling_pos = if pos % 2 == 0 { pos + 1 } else { pos - 1 };

            // Store the sibling (from proof)
            let sibling_idx = encode_node_index(level as u8, sibling_pos, tree_height);
            nodes_map.insert(sibling_idx, sibling_hash.clone());

            // Move to parent
            db_node_idx >>= 1;
            pos /= 2;

            // Store the parent hash from DB (if available)
            if let Some(parent_hash) = path_nodes.get(&db_node_idx) {
                let parent_idx = encode_node_index((level + 1) as u8, pos, tree_height);
                nodes_map.insert(parent_idx, parent_hash.clone());
            }
        }
    }

    let mut sorted_nodes: Vec<(u64, Hash)> = nodes_map.into_iter().collect();
    sorted_nodes.sort_by_key(|(idx, _)| *idx);

    sorted_nodes
        .into_iter()
        .map(|(index, hash)| Node { index, hash })
        .collect()
}
```
🧩 Analysis chain
🏁 Script executed:
```shell
find . -name "get_queue_elements.rs" -type f
```
Repository: Lightprotocol/photon
Length of output: 102

🏁 Script executed:
```shell
git ls-files | grep -i "get_queue_elements"
```
Repository: Lightprotocol/photon
Length of output: 100

🏁 Script executed:
```shell
git ls-files | grep "src/api" | head -20
```
Repository: Lightprotocol/photon
Length of output: 1070

🏁 Script executed:
```shell
git ls-files | head -30
```
Repository: Lightprotocol/photon
Length of output: 1146

🏁 Script executed:
```shell
find . -name "*.rs" -path "*/api/*" -type f 2>/dev/null | head -20
```
Repository: Lightprotocol/photon
Length of output: 1150

🏁 Script executed:
```shell
find . -name "*.rs" -type f 2>/dev/null | grep -E "(api|queue)" | head -20
```
Repository: Lightprotocol/photon
Length of output: 1150

🏁 Script executed:
```shell
ls -la . && find . -maxdepth 2 -type d
```
Repository: Lightprotocol/photon
Length of output: 1254

🏁 Script executed:
```shell
rg "deduplicate_nodes_from_refs" 2>/dev/null
```
Repository: Lightprotocol/photon
Length of output: 252

🏁 Script executed:
```shell
rg "leaf_index_to_node_index" 2>/dev/null
```
Repository: Lightprotocol/photon
Length of output: 1726

🏁 Script executed:
```shell
find . -type f -name "*.rs" 2>/dev/null | head -10
```
Repository: Lightprotocol/photon
Length of output: 576

🏁 Script executed:
```shell
test -f "src/api/method/get_queue_elements.rs" && echo "File exists" || echo "File does not exist"
```
Repository: Lightprotocol/photon
Length of output: 75

🏁 Script executed:
```shell
pwd && ls -la && git status
```
Repository: Lightprotocol/photon
Length of output: 1076

🏁 Script executed:
```shell
git rev-parse --show-toplevel
```
Repository: Lightprotocol/photon
Length of output: 82

🏁 Script executed:
```shell
wc -l src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 105

🏁 Script executed:
```shell
sed -n '992,1038p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 1969

🏁 Script executed:
```shell
sed -n '980,1050p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 2757

🏁 Script executed:
```shell
grep -n "deduplicate_nodes_from_refs" src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 186

🏁 Script executed:
```shell
sed -n '900,950p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 1923

🏁 Script executed:
```shell
grep -B 50 "deduplicate_nodes_from_refs" src/api/method/get_queue_elements.rs | head -80
```
Repository: Lightprotocol/photon
Length of output: 2987

🏁 Script executed:
```shell
grep -n "fn leaf_index_to_node_index" src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 138

🏁 Script executed:
```shell
sed -n '1039,1080p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 1396

🏁 Script executed:
```shell
grep -n "fn encode_node_index" src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 139

🏁 Script executed:
```shell
sed -n '1041,1055p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 643

🏁 Script executed:
```shell
tail -150 src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 5016
🏁 Script executed:
```shell
python3 << 'EOF'
# Let's understand if leaf_index_to_node_index can fail with overflow
# leaf_index: u32, tree_height: u32
# Operation: (leaf_index + 2^tree_height) as i64
# This could overflow if tree_height is large enough
# Max u32 = 4,294,967,295
# Max tree_height that fits: log2(i64::MAX - u32::MAX) ≈ 32
leaf_index_max = 2**32 - 1  # 4,294,967,295
tree_height_example = 32
calculation = leaf_index_max + 2**tree_height_example
i64_max = 2**63 - 1
print(f"leaf_index max: {leaf_index_max}")
print(f"2^32: {2**32}")
print(f"Calculation with tree_height=32: {calculation}")
print(f"i64::MAX: {i64_max}")
print(f"Would overflow: {calculation > i64_max}")
# With tree_height = 31
tree_height_31 = 31
calc_31 = leaf_index_max + 2**tree_height_31
print(f"\nWith tree_height=31: {calc_31}")
print(f"Would overflow: {calc_31 > i64_max}")
EOF
```
Repository: Lightprotocol/photon
Length of output: 150
🏁 Script executed:
```shell
sed -n '900,990p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 3149

🏁 Script executed:
```shell
grep -B 5 "fn merge_state_queue_proofs" src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 146

🏁 Script executed:
```shell
sed -n '820,850p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 1393

🏁 Script executed:
```shell
sed -n '920,990p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 2458

🏁 Script executed:
```shell
sed -n '1041,1055p' src/api/method/get_queue_elements.rs
```
Repository: Lightprotocol/photon
Length of output: 643
Make deduplicate_nodes_from_refs fallible to propagate index calculation errors
`deduplicate_nodes_from_refs` correctly deduplicates nodes across proofs using DB-fetched path nodes, but currently ignores potential overflow in `leaf_index_to_node_index` and `encode_node_index`. The former computes `2_i64.pow(tree_height - 1)` with an unbounded u32 exponent, and the latter shifts by `level` bits; both can overflow with invalid inputs.
Change the signature to return Result<Vec<Node>, PhotonApiError> and propagate errors from leaf_index_to_node_index with ? operators. Since merge_state_queue_proofs already returns Result, error propagation integrates naturally.
🤖 Prompt for AI Agents
In src/api/method/get_queue_elements.rs around lines 992 to 1038, make
deduplicate_nodes_from_refs return Result<Vec<Node>, PhotonApiError> and
propagate index-calculation errors instead of ignoring potential overflows:
update the function signature to return Result<_, PhotonApiError>, replace
direct calls to leaf_index_to_node_index and encode_node_index with their
fallible variants (or call them and use ?), and propagate any returned errors
with the ? operator; ensure all internal uses that can fail
(leaf_index_to_node_index, encode_node_index) are handled via ? and update
callers (e.g. merge_state_queue_proofs) to accept the Result return so errors
bubble up.
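Making the index helper fallible, as the prompt suggests, mostly comes down to replacing unchecked arithmetic with checked variants. The following is a hypothetical standalone sketch (simplified signature, `String` errors instead of `PhotonApiError`), not the crate's actual implementation:

```rust
// Hypothetical fallible variant: checked_shl/checked_add surface overflow
// as an Err instead of panicking (debug) or wrapping (release).
fn leaf_index_to_node_index(leaf_index: u32, tree_height: u32) -> Result<i64, String> {
    // Reject heights that are meaningless or would push the shift into
    // (or past) the i64 sign bit.
    if tree_height == 0 || tree_height >= 63 {
        return Err(format!("invalid tree_height {}", tree_height));
    }
    1i64
        .checked_shl(tree_height - 1) // 2^(tree_height - 1)
        .and_then(|base| base.checked_add(leaf_index as i64))
        .ok_or_else(|| format!("node index overflow at height {}", tree_height))
}

fn main() {
    // 2^(27-1) + 5 for a height-27 tree
    assert_eq!(leaf_index_to_node_index(5, 27), Ok((1i64 << 26) + 5));
    assert!(leaf_index_to_node_index(0, 0).is_err());
    assert!(leaf_index_to_node_index(0, 64).is_err());
    println!("ok");
}
```

Callers then propagate the error with `?`, which is why the surrounding functions need `Result` return types first.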
```diff
 let response = match txn_or_conn.get_database_backend() {
-    // HACK: I am executing SQL queries one by one in a loop because I am getting a weird syntax
-    // error when I am using parentheses.
     DatabaseBackend::Postgres => {
-        let sql_statements = values.iter().map(|value| {
-            format!(
-                "( SELECT * FROM indexed_trees WHERE tree = {} AND value < {} ORDER BY value DESC LIMIT 1 )",
-                format_bytes(tree.clone(), txn_or_conn.get_database_backend()),
-                format_bytes(value.clone(), txn_or_conn.get_database_backend())
-            )
-        });
-        let full_query = sql_statements.collect::<Vec<String>>().join(" UNION ALL ");
-        txn_or_conn
-            .query_all(Statement::from_string(
-                txn_or_conn.get_database_backend(),
-                full_query,
-            ))
-            .await
-            .map_err(|e| {
-                IngesterError::DatabaseError(format!("Failed to execute indexed query: {e}"))
-            })?
+        // Batch queries in chunks to avoid query plan explosion
+        // Each chunk uses UNION ALL which PostgreSQL optimizes well with index scans
+        const BATCH_SIZE: usize = 100;
+        let tree_bytes = format_bytes(tree.clone(), txn_or_conn.get_database_backend());
+        let mut all_results = vec![];
+
+        for chunk in values.chunks(BATCH_SIZE) {
+            let sql_statements = chunk.iter().map(|value| {
+                format!(
+                    "(SELECT * FROM indexed_trees WHERE tree = {} AND value < {} ORDER BY value DESC LIMIT 1)",
+                    tree_bytes,
+                    format_bytes(value.clone(), txn_or_conn.get_database_backend())
+                )
+            });
+            let full_query = sql_statements.collect::<Vec<String>>().join(" UNION ALL ");
+
+            let chunk_results = txn_or_conn
+                .query_all(Statement::from_string(
+                    txn_or_conn.get_database_backend(),
+                    full_query,
+                ))
+                .await
+                .map_err(|e| {
+                    IngesterError::DatabaseError(format!("Failed to execute indexed query: {e}"))
+                })?;
+            all_results.extend(chunk_results);
+        }
+        all_results
     }
```
🧩 Analysis chain
🏁 Script executed:
```shell
# Search for similar batching patterns, UNION ALL usage, and BATCH_SIZE references
rg -n "UNION ALL|BATCH_SIZE|= ANY" --type rust -B 3 -A 3
```
Repository: Lightprotocol/photon
Length of output: 8481

🏁 Script executed:
```shell
# Check for comments or documentation about performance decisions in the file
rg -n "batch|performance|query plan|PostgreSQL" src/ingester/persist/indexed_merkle_tree/proof.rs -i -B 2 -A 2
```
Repository: Lightprotocol/photon
Length of output: 1566

🏁 Script executed:
```shell
# Look for git history or comments explaining why BATCH_SIZE=100 was chosen
rg -n "100" src/ingester/persist/indexed_merkle_tree/proof.rs -B 5 -A 5
```
Repository: Lightprotocol/photon
Length of output: 1352

🏁 Script executed:
```shell
# Search for similar query construction patterns in the codebase
rg -n "query_all|Statement::from_string" --type rust -B 2 -A 2 | head -50
```
Repository: Lightprotocol/photon
Length of output: 3714
Document the BATCH_SIZE=100 rationale and verify UNION ALL is optimal.
The comment about avoiding "query plan explosion" alludes to a real PostgreSQL concern (query planning complexity grows with UNION size), but the specific BATCH_SIZE=100 choice lacks justification. Add a comment explaining:
- Why 100 is the threshold (query plan size limits, parameter constraints, or performance testing results)
- Why `UNION ALL` is preferred over alternatives like `= ANY(ARRAY[...])` or `VALUES` clauses with JOINs
This pattern appears in at least two functions in this file (lines 190 and 282), so documenting the rationale once would clarify the design decision across the codebase.
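For illustration, the chunked `UNION ALL` construction the comment refers to can be sketched independently of SeaORM. This is a hypothetical string-building sketch with placeholder table/column names and pre-formatted hex literals, not the crate's actual query path:

```rust
// Hypothetical sketch: build one UNION ALL statement per chunk of values,
// mirroring the per-value "greatest row below value" subquery pattern.
// Chunking caps how many subqueries the planner must consider per statement.
const BATCH_SIZE: usize = 100;

fn build_chunk_queries(tree_hex: &str, value_hexes: &[String]) -> Vec<String> {
    value_hexes
        .chunks(BATCH_SIZE)
        .map(|chunk| {
            chunk
                .iter()
                .map(|v| {
                    format!(
                        "(SELECT * FROM indexed_trees WHERE tree = {} AND value < {} \
                         ORDER BY value DESC LIMIT 1)",
                        tree_hex, v
                    )
                })
                .collect::<Vec<_>>()
                .join(" UNION ALL ")
        })
        .collect()
}

fn main() {
    let values: Vec<String> = (0..250).map(|i| format!("'\\x{:02x}'", i)).collect();
    let queries = build_chunk_queries("'\\xab'", &values);
    // 250 values -> 3 statements of at most 100 subqueries each
    assert_eq!(queries.len(), 3);
    assert_eq!(queries[0].matches("UNION ALL").count(), 99);
    println!("ok");
}
```

A per-value `ORDER BY ... LIMIT 1` subquery is not expressible with a plain `= ANY(ARRAY[...])` filter, which is one plausible reason the `UNION ALL` shape was kept; documenting that trade-off in the source, as the review asks, would make the choice explicit.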
Actionable comments posted: 0
♻️ Duplicate comments (8)
src/api/method/get_queue_elements.rs (7)
35-43: Critical:debug_assert!stripped in release builds, allowing silent data corruption.This issue was previously flagged but remains unaddressed. The
debug_assert!at lines 36-41 is removed in release builds, so invalid inputs (level > tree_height) can silently corrupt node indices in production.Apply this diff to add runtime validation:
#[inline] fn encode_node_index(level: u8, position: u64, tree_height: u8) -> u64 { - debug_assert!( + assert!( level <= tree_height, "level {} > tree_height {}", level, tree_height ); ((level as u64) << 56) | position }
47-49: Critical: Integer overflow risk for tree_height >= 32.This issue was previously flagged but remains unaddressed. If
tree_height >= 32, the expression2_i64.pow(tree_height - 1)will overflow i64, causing a panic in debug mode or wrap-around in release mode.Apply this diff to add validation:
#[inline] fn leaf_index_to_node_index(leaf_index: u32, tree_height: u32) -> i64 { + assert!( + tree_height > 0 && tree_height < 32, + "tree_height {} would overflow i64 or is invalid", + tree_height + ); 2_i64.pow(tree_height - 1) + leaf_index as i64 }
277-327: Major: Missing consistency checks when merging output and input queue proofs.This issue was previously flagged but remains unaddressed. The function silently uses first-seen values for
tree_height,initial_root, androot_seqwithout validating that output and input queues agree. If independently-fetched queues represent different state versions, the result is an incoherentStateQueueData.Since
deduplicate_nodes_from_refsuses a singletree_heightfor all proofs, mismatched heights produce incorrect node calculations.Apply this diff to add explicit validation:
if let Some(ref proof_data) = input_proof_data { - if tree_height.is_none() { - tree_height = Some(proof_data.tree_height); - } + if let Some(h) = tree_height { + if h != proof_data.tree_height { + return Err(PhotonApiError::UnexpectedError(format!( + "Mismatched tree heights between output and input proofs: {} vs {}", + h, proof_data.tree_height + ))); + } + } else { + tree_height = Some(proof_data.tree_height); + } all_path_nodes.extend(proof_data.path_nodes.clone()); for proof in &proof_data.proofs { - if initial_root.is_none() { - initial_root = Some(proof.root.clone()); - root_seq = Some(proof.root_seq); - } + if let Some(ref root) = initial_root { + if *root != proof.root { + return Err(PhotonApiError::UnexpectedError( + "Mismatched initial_root across state queue proofs".to_string(), + )); + } + } else { + initial_root = Some(proof.root.clone()); + root_seq = Some(proof.root_seq); + } all_proofs.push(proof); } }
815-816: Critical: Potential data loss from u8 cast of tree height.This issue was previously flagged but remains unaddressed. Lines 816, 826, and 840 cast
tree_info.heighttou8. If the tree height exceeds 255, this silently truncates the value, leading to incorrect node index encoding.Apply validation before calling
encode_node_index:+ if tree_info.height > u8::MAX as u32 { + return Err(PhotonApiError::ValidationError(format!( + "Tree height {} exceeds maximum supported height of {}", + tree_info.height, u8::MAX + ))); + } + let tree_height_u8 = tree_info.height as u8; + // ... later in the code: let leaf_idx = - encode_node_index(0, proof.lowElementLeafIndex as u64, tree_info.height as u8); + encode_node_index(0, proof.lowElementLeafIndex as u64, tree_height_u8); // ... and update all other call sites similarlyAlso applies to: 826-826, 839-840
843-846: Minor: Silent error suppression in hash computation loop.This issue was previously flagged but remains unaddressed. Line 843 uses
breakwhenPoseidon::hashvreturns an error, silently stopping the proof path computation without propagating the error. This could lead to incomplete node maps and incorrect proofs.Consider propagating the error instead:
match parent_hash { Ok(hash) => { current_hash = Hash::from(hash); let parent_pos = pos / 2; let parent_idx = encode_node_index((level + 1) as u8, parent_pos, tree_info.height as u8); nodes_map.insert(parent_idx, current_hash.clone()); } - Err(_) => { - break; + Err(e) => { + return Err(PhotonApiError::UnexpectedError(format!( + "Failed to compute parent hash at level {}: {}", level, e + ))); } }
988-1031: Major:deduplicate_nodes_from_refsshould be fallible to propagate index calculation errors.This issue was previously flagged but remains unaddressed. The function calls
leaf_index_to_node_indexandencode_node_index, both of which can overflow with invalid inputs. The current signature returnsVec<Node>directly, preventing error propagation.Change the signature to return
Result<Vec<Node>, PhotonApiError>and propagate errors from index calculations.Apply this diff:
-fn deduplicate_nodes_from_refs( +fn deduplicate_nodes_from_refs( proofs: &[&crate::ingester::persist::MerkleProofWithContext], tree_height: u8, path_nodes: &HashMap<i64, Hash>, -) -> Vec<Node> { +) -> Result<Vec<Node>, PhotonApiError> { let mut nodes_map: HashMap<u64, Hash> = HashMap::new(); let tree_height_u32 = tree_height as u32 + 1; for proof_ctx in proofs { let mut pos = proof_ctx.leaf_index as u64; let mut db_node_idx = leaf_index_to_node_index(proof_ctx.leaf_index, tree_height_u32); // ... rest of function } - sorted_nodes + Ok(sorted_nodes .into_iter() .map(|(index, hash)| Node { index, hash }) - .collect() + .collect()) }Then update the caller at line 320:
- let nodes = deduplicate_nodes_from_refs(&all_proofs, height, &all_path_nodes); + let nodes = deduplicate_nodes_from_refs(&all_proofs, height, &all_path_nodes)?;
1056-1070: Critical: u64 to u32 truncation in leaf index.This issue was previously flagged but remains unaddressed. Line 1059 casts
leaf_idx(u64) to u32 when callingleaf_index_to_node_index. This silently truncates leaf indices exceeding u32::MAX (~4.3 billion), corrupting the node index calculation and path fetching.Apply this diff to validate the cast:
let all_path_indices: Vec<i64> = leaf_indices .iter() .flat_map(|&leaf_idx| { + if leaf_idx > u32::MAX as u64 { + log::error!("Leaf index {} exceeds u32::MAX", leaf_idx); + return vec![]; + } let node_idx = leaf_index_to_node_index(leaf_idx as u32, tree_height);Or better, refactor
leaf_index_to_node_indexto accept u64 and validate internally, then update all call sites.src/ingester/persist/indexed_merkle_tree/proof.rs (1)
182-217: Early-return + batched Postgres path look good; BATCH_SIZE rationale still undocumented. The empty-input guard (Lines 182–184) is a nice win: it avoids building SQL when there's nothing to query and keeps behavior consistent with the previous "returns empty map" semantics.

The Postgres batching logic (Lines 188–217) looks correct: it reuses a preformatted `tree` literal, chunks inputs, and UNION-ALLs per-value `SELECT … ORDER BY value DESC LIMIT 1`, then materializes models from the combined rows. Error handling is also consistent with the SQLite branch. The remaining gap from the earlier review is that `const BATCH_SIZE: usize = 100;` is still a magic number, and the UNION-ALL approach isn't documented anywhere central. Consider:

- Adding a brief comment somewhere (or a module-level doc) explaining why 100 was chosen and why UNION ALL is preferred here.
- Optionally hoisting `BATCH_SIZE` to a shared `const` so both this function and the by-address variant can be tuned together.
🧹 Nitpick comments (2)
src/ingester/persist/indexed_merkle_tree/proof.rs (2)
261-364: New `query_next_smallest_elements_by_address` helper is functionally sound; consider minor robustness tweaks. The helper correctly:

- Short-circuits on empty input.
- Uses backend-aware hex formatting for `tree`/`value` and tags each row with an `input_address` column so results can be keyed by the original address.
- Mirrors behavior across Postgres (`{val}::bytea as input_address`) and SQLite (`CAST({val} AS BLOB) as input_address`), and populates a `HashMap<Vec<u8>, indexed_trees::Model>` from the result rows.

Two small follow-ups you might consider:

- If the caller ever passes duplicate addresses, later entries will silently overwrite earlier ones in the `HashMap`. If that's undesirable, either dedupe `values` up front or document that behavior in the doc comment.
- `BATCH_SIZE` here duplicates the constant in `query_next_smallest_elements`; if you keep both, a shared `const` at module scope would reduce the chance of them drifting.

Overall, the implementation is consistent and readable.
379-388: Batched exclusion-range v2 flow is correct; proof lookup could be made more direct. Using `query_next_smallest_elements_by_address` to build `address_to_range` (Lines 379–388) and then iterating the original `addresses` slice (Lines 394–397) preserves stable address ordering and correctly handles:

- Addresses with a concrete range node (hashed into `leaf_nodes_with_indices` and `address_to_model`).
- Addresses without a range node (later picked up by the empty-proof path).

The final join between `address_to_model` and `leaf_proofs` is done by searching `leaf_proofs.iter().find(|p| p.leaf_index == model.leaf_index as u32)` for each address. That's perfectly fine at current scales (DB work dominates), but if this ever runs on large batches, you might want to prebuild a `HashMap<u32, &MerkleProofWithContext>` keyed by `leaf_index` to avoid repeated linear scans. Functionally, the new flow looks consistent and maintains correct empty-tree handling via `proof_for_empty_tree_with_seq`.

Also applies to: 394-397
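The prebuilt-map alternative the nitpick mentions is straightforward; the sketch below uses a hypothetical simplified proof struct (the real `MerkleProofWithContext` carries hashes and a full proof path):

```rust
use std::collections::HashMap;

// Hypothetical simplified stand-in for MerkleProofWithContext.
#[derive(Debug)]
pub struct Proof {
    pub leaf_index: u32,
    pub root_seq: u64,
}

// Build the lookup once, instead of running
// `leaf_proofs.iter().find(|p| p.leaf_index == idx)` for every address.
pub fn index_by_leaf(proofs: &[Proof]) -> HashMap<u32, &Proof> {
    proofs.iter().map(|p| (p.leaf_index, p)).collect()
}

fn main() {
    let proofs = vec![
        Proof { leaf_index: 7, root_seq: 1 },
        Proof { leaf_index: 3, root_seq: 2 },
    ];
    let by_leaf = index_by_leaf(&proofs);
    assert_eq!(by_leaf[&3].root_seq, 2); // O(1) lookup per address
    assert!(by_leaf.get(&99).is_none()); // absent leaf handled explicitly
    println!("ok");
}
```

This turns an O(addresses × proofs) join into O(addresses + proofs), at the cost of one extra allocation, which only matters at large batch sizes as the review notes.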
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/api/method/get_queue_elements.rs (2 hunks)
- src/ingester/persist/indexed_merkle_tree/proof.rs (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/ingester/persist/indexed_merkle_tree/proof.rs (1)
src/common/mod.rs (1)
format_bytes(157-164)
src/api/method/get_queue_elements.rs (4)
src/common/mod.rs (1)
format_bytes (157-164)
src/ingester/persist/leaf_node_proof.rs (2)
get_multiple_compressed_leaf_proofs_by_indices (14-90), indices (42-42)
src/ingester/persist/persisted_state_tree.rs (2)
get_subtrees (132-209), bytes (571-574)
src/ingester/persist/leaf_node.rs (4)
leaf_index_to_node_index (30-32), from (35-42), from (46-53), from (57-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (3)
src/api/method/get_queue_elements.rs (3)
179-275: LGTM: Well-structured orchestration with proper validation. The main entry point correctly validates that at least one queue is requested, orchestrates per-queue fetching, merges proofs for state queues, and handles address queues separately. Error propagation is appropriate.

595-669: LGTM: Robust hash chain computation with proper error handling. The function correctly handles empty inputs, performs batch processing, extracts per-queue-type values with descriptive error messages, and propagates hash chain creation errors appropriately.
329-593: LGTM: Comprehensive fetch logic with excellent error handling. The function properly validates inputs, detects pruning scenarios, handles batch size truncation, and includes robust error handling for hash conversions. Notably, the previous unwrap() issue at lines 477-488 has been correctly addressed with proper `map_err` and `collect::<Result<...>>()?` patterns.

The hash chain caching with fallback to local computation is well-designed.
Overview
Testing
Summary by CodeRabbit
New Features
Breaking Changes
Chores
✏️ Tip: You can customize this high-level summary in your review settings.