Closed
Commits
62 commits
69a9a1c
ARROW-11303: [Release][C++] Enable mimalloc in the windows verificati…
kszucs Jan 18, 2021
903b41c
ARROW-11309: [Release][C#] Use .NET 3.1 for verification
kou Jan 19, 2021
19e9559
ARROW-11315: [Packaging][APT][arm64] Add missing gir1.2 files
kou Jan 19, 2021
17a3fab
ARROW-11314: [Release][APT][Yum] Add support for verifying arm64 pack…
kou Jan 19, 2021
275fda1
ARROW-7633: [C++][CI] Create fuzz targets for tensors and sparse tensors
mrkn Jan 19, 2021
2d3e8f9
ARROW-11246: [Rust] Add type to Unexpected accumulator state error
ovr Jan 19, 2021
e20f439
ARROW-11254: [Rust][DataFusion] Add SIMD and snmalloc flags as option…
Dandandan Jan 19, 2021
18dc62c
ARROW-11074: [Rust][DataFusion] Implement predicate push-down for par…
yordan-pavlov Jan 19, 2021
127961a
ARROW-10489: [C++] Add Intel C++ compiler options for different warni…
jcmuel Jan 19, 2021
0e5d646
ARROW-9128: [C++] Implement string space trimming kernels: trim, ltri…
maartenbreddels Jan 19, 2021
f63cffa
ARROW-11305 Skip first argument (which is the program name) in parque…
jhorstmann Jan 19, 2021
7e0cb0a
ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
jorgecarleitao Jan 19, 2021
b448de7
ARROW-11216: [Rust] add doc example for StringDictionaryBuilder
alamb Jan 19, 2021
4a6eb19
ARROW-11268: [Rust][DataFusion] MemTable::load output partition support
Dandandan Jan 19, 2021
a4266a1
ARROW-11321: [Rust][DataFusion] Fix DataFusion compilation error
Dandandan Jan 19, 2021
bbc9029
ARROW-11156: [Rust][DataFusion] Create hashes vectorized in hash join
Dandandan Jan 19, 2021
8e218e0
ARROW-11313: [Rust] Fixed size_hint
jorgecarleitao Jan 19, 2021
35053fe
ARROW-11222: [Rust] Catch up with flatbuffers 0.8.1 which had some UB…
mqy Jan 19, 2021
50ba534
ARROW-11277: [C++] Workaround macOS 10.11: don't default construct co…
bkietz Jan 19, 2021
a7633c7
ARROW-11322: [Rust] Re-opening `memory` module as public
maxburke Jan 20, 2021
555643a
ARROW-11269: [Rust] [Parquet] Preserve timezone in int96 reader
nevi-me Jan 20, 2021
e7c69e6
ARROW-11279: [Rust][Parquet] ArrowWriter Definition Levels Memory Usage
Jan 20, 2021
71572bd
ARROW-11318: [Rust] Support pretty printing timestamp, date, and time…
alamb Jan 20, 2021
ed709e0
ARROW-11311: [Rust] Fixed unset_bit
jorgecarleitao Jan 20, 2021
01c5aec
ARROW-11265: [Rust] Made bool not ArrowNativeType
jorgecarleitao Jan 20, 2021
6912869
ARROW-11290: [Rust][DataFusion] Address hash aggregate performance is…
Dandandan Jan 20, 2021
23550c2
ARROW-11149: [Rust] DF Support List/LargeList/FixedSizeList in create…
ovr Jan 20, 2021
a0e1244
ARROW-11329: [Rust] Don't rerun build.rs on every file change
mbrubeck Jan 20, 2021
8b56f85
ARROW-11220: [Rust] Implement GROUP BY support for Boolean
ovr Jan 21, 2021
4601c02
ARROW-11330: [Rust][DataFusion] add ExpressionVisitor to encode expre…
alamb Jan 21, 2021
84126d5
ARROW-11323: [Rust][DataFusion] Allow sort queries to return no results
alamb Jan 21, 2021
bd90043
ARROW-10831: [C++][Compute] Implement quantile kernel
cyb70289 Jan 21, 2021
72bf95a
ARROW-11334: [Python][CI] Fix failing pandas nightly tests
jorisvandenbossche Jan 21, 2021
bc5d8bf
ARROW-11320: [C++] Try to strengthen temporary dir creation
pitrou Jan 21, 2021
c413566
ARROW-11141: [Rust] Add basic Miri checks to CI pipeline
vertexclique Jan 21, 2021
6959e46
ARROW-11337: [C++] Compilation error with ThreadSanitizer
westonpace Jan 22, 2021
499b6d0
ARROW-11333: [Rust] Generalized creation of empty arrays.
jorgecarleitao Jan 22, 2021
629a6fd
ARROW-10299: [Rust] Use IPC Metadata V5 as default
nevi-me Jan 22, 2021
457fa91
ARROW-11343: [Rust][DataFusion] Simplified example with UDF.
jorgecarleitao Jan 22, 2021
251ecac
ARROW-10766: [Rust] [Parquet] Compute nested list definitions
nevi-me Jan 22, 2021
262bbdc
ARROW-11332: [Rust] Use MutableBuffer in take_string instead of Vec
Dandandan Jan 22, 2021
37c70fb
Add from_iter_values to create arrays from (non null) values
Dandandan Jan 22, 2021
8cd118d
Remove borrow (they are primitive types anyway)
Dandandan Jan 22, 2021
3a63974
Fix comment
Dandandan Jan 22, 2021
b44a4ad
ARROW-11299: [Python] Fix invalid-offsetof warnings
cyb70289 Jan 22, 2021
13e2134
ARROW-11291: [Rust] Add extend to MutableBuffer (-20% for arithmetic,…
jorgecarleitao Jan 23, 2021
67d0c2e
ARROW-11319: [Rust] [DataFusion] Improve test comparisons to record b…
alamb Jan 23, 2021
79c92aa
Merge branch 'master' of github.com:apache/arrow into array_iter_non_…
Dandandan Jan 23, 2021
941ee5d
Use extend
Dandandan Jan 23, 2021
a37941c
Use .collect() api
Dandandan Jan 23, 2021
e448fcc
Use iterators in `to_array_of_size`
Dandandan Jan 23, 2021
69c298e
Add benchmark
Dandandan Jan 23, 2021
10f4ada
ARROW-11317: [Rust] Include the prettyprint feature in CI Coverage
alamb Jan 24, 2021
d612b0f
Use none
Dandandan Jan 24, 2021
6144a23
Use None for Microsecond as well
Dandandan Jan 24, 2021
f2c4e26
Use `None`
Dandandan Jan 24, 2021
cf7638f
ARROW-11349: [Rust] Add from_iter_values to create arrays from (non n…
Dandandan Jan 25, 2021
e07f7e5
Merge remote-tracking branch 'upstream/master' into to_array_of_size_…
Dandandan Jan 25, 2021
eddf021
Use None for strings
Dandandan Jan 25, 2021
555eb1d
fmt
Dandandan Jan 25, 2021
8a20338
Merge remote-tracking branch 'upstream/master' into to_array_of_size_…
Dandandan Jan 26, 2021
32a9e0e
Add license
Dandandan Jan 29, 2021
4 changes: 4 additions & 0 deletions rust/datafusion/Cargo.toml
@@ -88,3 +88,7 @@ harness = false
[[bench]]
name = "filter_query_sql"
harness = false

[[bench]]
name = "scalar"
harness = false
30 changes: 30 additions & 0 deletions rust/datafusion/benches/scalar.rs
@@ -0,0 +1,30 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

use criterion::{criterion_group, criterion_main, Criterion};
use datafusion::scalar::ScalarValue;

fn criterion_benchmark(c: &mut Criterion) {
    c.bench_function("to_array_of_size 100000", |b| {
        let scalar = ScalarValue::Int32(Some(100));

        b.iter(|| assert_eq!(scalar.to_array_of_size(100000).null_count(), 0))
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
129 changes: 105 additions & 24 deletions rust/datafusion/src/scalar.rs
@@ -17,7 +17,7 @@

//! This module provides ScalarValue, an enum that can be used for storage of single elements

use std::{convert::TryFrom, fmt, sync::Arc};
use std::{convert::TryFrom, fmt, iter::repeat, sync::Arc};

use arrow::array::{
Int16Builder, Int32Builder, Int64Builder, Int8Builder, ListBuilder,
@@ -205,28 +205,104 @@ impl ScalarValue {
ScalarValue::Boolean(e) => {
Arc::new(BooleanArray::from(vec![*e; size])) as ArrayRef
}
ScalarValue::Float64(e) => {
Arc::new(Float64Array::from(vec![*e; size])) as ArrayRef
}
ScalarValue::Float32(e) => Arc::new(Float32Array::from(vec![*e; size])),
ScalarValue::Int8(e) => Arc::new(Int8Array::from(vec![*e; size])),
ScalarValue::Int16(e) => Arc::new(Int16Array::from(vec![*e; size])),
ScalarValue::Int32(e) => Arc::new(Int32Array::from(vec![*e; size])),
ScalarValue::Int64(e) => Arc::new(Int64Array::from(vec![*e; size])),
ScalarValue::UInt8(e) => Arc::new(UInt8Array::from(vec![*e; size])),
ScalarValue::UInt16(e) => Arc::new(UInt16Array::from(vec![*e; size])),
ScalarValue::UInt32(e) => Arc::new(UInt32Array::from(vec![*e; size])),
ScalarValue::UInt64(e) => Arc::new(UInt64Array::from(vec![*e; size])),
ScalarValue::TimeMicrosecond(e) => {
Arc::new(TimestampMicrosecondArray::from(vec![*e]))
}
ScalarValue::TimeNanosecond(e) => {
Arc::new(TimestampNanosecondArray::from_opt_vec(vec![*e], None))
}
ScalarValue::Utf8(e) => Arc::new(StringArray::from(vec![e.as_deref(); size])),
ScalarValue::LargeUtf8(e) => {
Arc::new(LargeStringArray::from(vec![e.as_deref(); size]))
}
ScalarValue::Float64(e) => match e {
Some(value) => {
Arc::new(Float64Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<Float64Array>()),
},
ScalarValue::Float32(e) => match e {
Some(value) => {
Arc::new(Float32Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<Float32Array>()),
},
ScalarValue::Int8(e) => match e {
Some(value) => {
Arc::new(Int8Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<Int8Array>()),
},
ScalarValue::Int16(e) => match e {
Some(value) => {
Arc::new(Int16Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<Int16Array>()),
},
ScalarValue::Int32(e) => match e {
Some(value) => {
Arc::new(Int32Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<Int32Array>()),
},
ScalarValue::Int64(e) => match e {
Some(value) => {
Arc::new(Int64Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<Int64Array>()),
},
ScalarValue::UInt8(e) => match e {
Some(value) => {
Arc::new(UInt8Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<UInt8Array>()),
},
ScalarValue::UInt16(e) => match e {
Some(value) => {
Arc::new(UInt16Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<UInt16Array>()),
},
ScalarValue::UInt32(e) => match e {
Some(value) => {
Arc::new(UInt32Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<UInt32Array>()),
},
ScalarValue::UInt64(e) => match e {
Some(value) => {
Arc::new(UInt64Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<UInt64Array>()),
},
ScalarValue::TimeMicrosecond(e) => match e {
Some(value) => Arc::new(TimestampMicrosecondArray::from_iter_values(
repeat(*value).take(size),
)),
None => Arc::new(
repeat(None)
.take(size)
.collect::<TimestampMicrosecondArray>(),
),
},
ScalarValue::TimeNanosecond(e) => match e {
Some(value) => Arc::new(TimestampNanosecondArray::from_iter_values(
repeat(*value).take(size),
)),
None => Arc::new(
repeat(None)
.take(size)
.collect::<TimestampNanosecondArray>(),
),
},
ScalarValue::Utf8(e) => match e {
Some(value) => {
Arc::new(StringArray::from_iter_values(repeat(value).take(size)))
}
None => {
Arc::new(repeat(None::<&str>).take(size).collect::<StringArray>())
}
},
ScalarValue::LargeUtf8(e) => match e {
Some(value) => {
Arc::new(LargeStringArray::from_iter_values(repeat(value).take(size)))
}
None => Arc::new(
repeat(None::<&str>)
.take(size)
.collect::<LargeStringArray>(),
),
},
ScalarValue::List(values, data_type) => Arc::new(match data_type {
DataType::Int8 => build_list!(Int8Builder, Int8, values, size),
DataType::Int16 => build_list!(Int16Builder, Int16, values, size),
@@ -238,7 +314,12 @@ impl ScalarValue {
DataType::UInt64 => build_list!(UInt64Builder, UInt64, values, size),
_ => panic!("Unexpected DataType for list"),
}),
ScalarValue::Date32(e) => Arc::new(Date32Array::from(vec![*e; size])),
ScalarValue::Date32(e) => match e {
Some(value) => {
Arc::new(Date32Array::from_iter_values(repeat(*value).take(size)))
}
None => Arc::new(repeat(None).take(size).collect::<Date32Array>()),
},
}
}
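
The pattern this hunk applies to every variant (a `from_iter_values` call over `repeat(*value).take(size)` for a present scalar, and `repeat(None).take(size).collect()` for a null one) can be sketched with the standard library alone. This is a minimal sketch, not the Arrow implementation: `SketchArray` and its fields are hypothetical stand-ins for an Arrow primitive array, and only the `i32` case is shown.

```rust
use std::iter::{repeat, FromIterator};

/// Hypothetical stand-in for an Arrow primitive array (not the real type).
/// `validity` is `None` when every slot is known to be non-null.
#[derive(Debug, PartialEq)]
struct SketchArray {
    values: Vec<i32>,
    validity: Option<Vec<bool>>,
}

impl SketchArray {
    /// Non-null path: consume plain values, so no per-element `Option`
    /// handling and no validity buffer are needed.
    fn from_iter_values<I: IntoIterator<Item = i32>>(iter: I) -> Self {
        SketchArray {
            values: iter.into_iter().collect(),
            validity: None,
        }
    }

    /// Number of null slots recorded in the validity buffer.
    fn null_count(&self) -> usize {
        self.validity
            .as_ref()
            .map_or(0, |v| v.iter().filter(|ok| !**ok).count())
    }
}

/// Nullable path: every item is inspected and a validity flag is recorded.
impl FromIterator<Option<i32>> for SketchArray {
    fn from_iter<I: IntoIterator<Item = Option<i32>>>(iter: I) -> Self {
        let mut values = Vec::new();
        let mut validity = Vec::new();
        for item in iter {
            validity.push(item.is_some());
            values.push(item.unwrap_or_default());
        }
        SketchArray {
            values,
            validity: Some(validity),
        }
    }
}

/// Mirrors the shape of one match arm in the hunk above: branch once on the
/// scalar, then fill `size` slots from a lazy iterator.
fn to_array_of_size(scalar: Option<i32>, size: usize) -> SketchArray {
    match scalar {
        Some(value) => SketchArray::from_iter_values(repeat(value).take(size)),
        None => repeat(None::<i32>).take(size).collect::<SketchArray>(),
    }
}
```

In this sketch the branch on `Some`/`None` happens once per scalar rather than once per element, and the iterator feeds the target buffer directly instead of first materialising a `vec![*e; size]` of `Option` values. Whether the real arrays gain the same way is what the `scalar` benchmark added in this PR is meant to measure.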
