impl(bigquery): expose query metadata on complete query handle#5925
impl(bigquery): expose query metadata on complete query handle#5925alvarowolfx wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request implements caching of rows, schema, page tokens, and query metadata in CompleteQuery, resolving a pending TODO, and refactors the Schema struct. The review feedback suggests several optimizations to avoid expensive clones of potentially large response data by utilizing std::mem::take and .as_ref(), as well as simplifying the Schema::new constructor to directly store the schema instead of manually reconstructing it.
| pub(crate) fn from_get_query_results_response(q: &Query, res: GetQueryResultsResponse) -> Self { | ||
| let schema = res | ||
| .schema | ||
| .clone() | ||
| .expect("complete query should have schema"); | ||
| let schema = Arc::new(Schema::new(schema)); | ||
| let page_token = if res.page_token.is_empty() { | ||
| None | ||
| } else { | ||
| Some(res.page_token.clone()) | ||
| }; | ||
| let cached_rows = VecDeque::from(res.rows.clone()); | ||
| Self { | ||
| job_service: q.job_service.clone(), | ||
| job_ref: q.job_ref.clone(), | ||
| cached_rows, | ||
| page_token, | ||
| schema, | ||
| metadata: QueryMetadata::from(res), | ||
| } | ||
| } |
There was a problem hiding this comment.
The res parameter is passed by value, meaning we own it. Instead of cloning res.rows (which can be very large and expensive to clone), we can move it into cached_rows using std::mem::take. By converting res to metadata first, we can also avoid cloning schema and page_token multiple times.
pub(crate) fn from_get_query_results_response(q: &Query, mut res: GetQueryResultsResponse) -> Self {
let cached_rows = VecDeque::from(std::mem::take(&mut res.rows));
let metadata = QueryMetadata::from(res);
let schema = metadata
.schema
.clone()
.expect("complete query should have schema");
let schema = Arc::new(Schema::new(schema));
let page_token = if metadata.page_token.is_empty() {
None
} else {
Some(metadata.page_token.clone())
};
Self {
job_service: q.job_service.clone(),
job_ref: q.job_ref.clone(),
cached_rows,
page_token,
schema,
metadata,
}
}References
- Scrutinize expensive uses of clone(). Is it necessary to copy the data? Can we move the data instead? (link)
| pub(crate) fn from_query_response(q: &Query, res: &QueryResponse) -> Self { | ||
| let schema = res | ||
| .schema | ||
| .clone() | ||
| .expect("complete query should have schema"); | ||
| let schema = Arc::new(Schema::new(schema)); | ||
| let page_token = if res.page_token.is_empty() { | ||
| None | ||
| } else { | ||
| Some(res.page_token.clone()) | ||
| }; | ||
| let cached_rows = VecDeque::from(res.rows.clone()); | ||
| Self { | ||
| job_service: q.job_service.clone(), | ||
| job_ref: q.job_ref.clone(), | ||
| cached_rows, | ||
| page_token, | ||
| schema, | ||
| metadata: QueryMetadata::from(res.clone()), | ||
| } | ||
| } |
There was a problem hiding this comment.
Since res is a reference, we must clone it. However, by cloning res once up front, we can move res.rows into cached_rows using std::mem::take and move the rest of res into metadata. This avoids cloning the potentially large rows vector twice.
pub(crate) fn from_query_response(q: &Query, res: &QueryResponse) -> Self {
let mut res = res.clone();
let cached_rows = VecDeque::from(std::mem::take(&mut res.rows));
let metadata = QueryMetadata::from(res);
let schema = metadata
.schema
.clone()
.expect("complete query should have schema");
let schema = Arc::new(Schema::new(schema));
let page_token = if metadata.page_token.is_empty() {
None
} else {
Some(metadata.page_token.clone())
};
Self {
job_service: q.job_service.clone(),
job_ref: q.job_ref.clone(),
cached_rows,
page_token,
schema,
metadata,
}
}References
- Scrutinize expensive uses of clone(). Is it necessary to copy the data? Can we move the data instead? (link)
|
|
||
| let completed = query.until_done().await?; | ||
| assert_eq!(completed.job_ref.unwrap().job_id, "some_job_id"); | ||
| assert_eq!(completed.job_ref.clone().unwrap().job_id, "some_job_id"); |
There was a problem hiding this comment.
Use .as_ref() instead of .clone() to inspect the Option without moving or cloning. This is more idiomatic and consistent with the other test at line 306.
| assert_eq!(completed.job_ref.clone().unwrap().job_id, "some_job_id"); | |
| assert_eq!(completed.job_ref.as_ref().unwrap().job_id, "some_job_id"); |
References
- Scrutinize expensive uses of clone(). Is it necessary to copy the data? Can we move the data instead? (link)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5925 +/- ##
==========================================
- Coverage 97.90% 97.88% -0.02%
==========================================
Files 234 235 +1
Lines 59940 60019 +79
==========================================
+ Hits 58683 58752 +69
- Misses 1257 1267 +10 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Towards #5844