Skip to content
This repository was archived by the owner on Aug 29, 2023. It is now read-only.

Lucene qat support only#11

Open
ZJie1 wants to merge 13 commits intoIntel-bigdata:masterfrom
ZJie1:LUCENE_QAT_SUPPORT_ONLY
Open

Lucene qat support only#11
ZJie1 wants to merge 13 commits intoIntel-bigdata:masterfrom
ZJie1:LUCENE_QAT_SUPPORT_ONLY

Conversation

@ZJie1
Copy link
Copy Markdown
Contributor

@ZJie1 ZJie1 commented Feb 28, 2020

No description provided.

Copy link
Copy Markdown

@winningsix winningsix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments left.

@@ -0,0 +1,588 @@
diff --git a/lucene/build.xml b/lucene/build.xml
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is intended for Lucence side to apply. We should provide version info to it. You can do it by creating related version number as folder name like what we did for Parquet/Hive.

+ public QatCompressionCompressingCodec() {
+ // we don't worry about zlib block overhead as it's
+ // not bad and try to save space instead:
+ this(61440, 512, false, 1024);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid magic number here.

+ bytes.length = decompressor.decompress(bytes.bytes, bytes.length, originalLength);
+ } catch (Error e) {
+ String s = e.getMessage();
+ System.out.println(e.getMessage());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log instead of sys out.

+ final int paddedLength = compressedLength;
+ compressed = ArrayUtil.grow(compressed, paddedLength + 1);
+ in.readBytes(compressed, 0, compressedLength);
+ compressed[compressedLength] = 0; // explicitly set dummy byte to 0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to do Padding here? Could you refer a link?

+ // pad with extra "dummy byte": see javadocs for using Inflater(true)
+ // we do it for compliance, but it's unnecessary for years in zlib.
+ final int paddedLength = compressedLength;
+ compressed = ArrayUtil.grow(compressed, paddedLength + 1);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resizing is very expensive operation for array. What's the main purpose to hold the compressed as a local variable? We can just create it on the fly.

+ compressor.finish();
+
+ int totalCount = 0;
+ for (; ; ) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use while loop with !compressor.finished as the condition.

+ for (; ; ) {
+ final int count = compressor.compress(compressed, totalCount, compressed.length - totalCount);
+ totalCount += count;
+ assert totalCount <= compressed.length;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this assert? What's case for compressed array length bigger than total count?

<maven-source-plugin.version>3.0.1</maven-source-plugin.version>
<maven-clean-plugin.version>3.0.0</maven-clean-plugin.version>
<maven-javadoc-plugin.version>2.10.4</maven-javadoc-plugin.version>
<!--<maven-assembly-plugin.version>3.0.0</maven-assembly-plugin.version>-20190909 by zj-->
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to exclude this maven plugin?

import java.nio.ByteBuffer;


/**
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove those comments.

}
}

private int directBufferSize;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move those parameters declarations before L49.

// Ignore failure to load
if(LOG.isDebugEnabled()) {
LOG.debug("Failed to load native-qat with error: " + t);
LOG.debug("java.library.path=" +
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need to output this msg here?

nativeCodeLoaded = true;
} catch (Throwable t) {
// Ignore failure to load
if(LOG.isDebugEnabled()) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this condition since it's already debug level.

* Closes the compressor and discards any unprocessed input.
*/

public synchronized void end() {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please throw unsupport operation exception here.

* @param len Length
*/

public synchronized void setInput(byte[] b, int off, int len) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to make it thread safe? I am thinking it's initialized and will not be reused.

uncompressedDirectBuf = ByteBuffer.allocateDirect(directBufferSize);
}
try {
compressedDirectBuf = (ByteBuffer) nativeAllocateBB(directBufferSize,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to use two separated buffer? Is it possible to reuse the compressor in the same time for both de/compression?

private Buffer uncompressedDirectBuf = null;
private byte[] userBuf = null;
private int userBufOff = 0, userBufLen = 0;
private boolean finished;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is thread-safe, we should make those variables volatile.

* @param forcePinned
* @param numa
*/
public QatDecompressorJNI(int directBufferSize, boolean useNativeAllocateBB,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid put too many logics into JNI class. It make it hard to reuse the code.


private static final Logger LOG = LogManager.getLogger(QatCompressorJNI.class);
private static final int DEFAULT_DIRECT_BUFFER_SIZE = 640 * 1024;
private static final int DEFAULT_DIRECT_BUFFER_SIZE = 64 * 1024;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did it work in our end-2-end test?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants