Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msgpack-value v07 API #109

Closed
wants to merge 10 commits into from
Closed

msgpack-value v07 API #109

wants to merge 10 commits into from

Conversation

@xerial
Copy link
Member

@xerial xerial commented Jun 16, 2014

I implemented msgpack-value API for v07. Here is an overview of the API:
powerpoint___-_2014-06-17-msgpack-value_pptx

msgpack-value has the following components:

  • Cursor: traverses message-packed values. The user calls hasNext(), then next() : Value or nextRef() : ValueRef to retrieve values.
  • Value: Immutable representation of a message-packed value.
    • NumberValue : An interface for Integer and Float values. It provides number conversion methods (w/ or w/o truncation/rounding) and value range checkers.
  • ValueRef: A reference to a message-packed value. This reference is valid until cursor.next()/nextRef() is called. To make it immutable, call ValueRef.toValue() : Value, which creates a copy of the referenced value.
  • ValueHolder: An implementation of ValueRef, a sort of union (in C language) that can hold any type of a value. This is useful to reduce the cost of Value instance creation. MessageUnpacker directly feeds a value into the holder via unpackValue(ValueHolder) method.

Miscellaneous notes:

  • org.msgpack.value is now within the msgpack-core project because Cursor and ValueHolder need to interact with MessageUnpacker, and it makes difficult to separate msgpack-core and msgpack-value projects. For example, to switch the holder to use (e.g., long? or BigInteger?), adding unpackValue(ValueHolder) to the unpacker, then let it choose the target holder is the most straightforward way. And also ValueHolder can be used as an implementation of ValueRef. Moving org.msgpack.value package within msgpack-core project makes simple these interactions.
  • Traversing 10,000 binary values using Cursor.nextRef is about 2.7 times faster than using Cursor.next (returns immutable values):
    1__leo_weaver_local___work_git_msgpack-java__java_
  • MessageUnpacker:
    • Now supports counting total read bytes.
    • Added unpackValue(ValueHolder), unpackInteger(IntegerHolder), unpackFloat(FloatHolder)
  • Array/MapCursor
    • Materialising large array/map values requires huge memory. Array/MapCursor interfaces that can be obtained from Cursor provide the methods for traversing arrays and maps without constructing the whole array/map values.
    • In the draft code, these cursors extended List<Value> and Map<Value, Value> interfaces, but this design makes complicate the implementation since Java does not allow mixin of the List<Value> and Value API implementations (e.g., AbstractValue). Because of this, we needed to copy many codes to implement Array/MapCursor. Instead, I chose to provide only hasNext() and next() methods to traverse map/array values.
  • Upgraded to Scala 2.11 (Scala is used only for building and testing). TravisCI now runs sbt test
  • Added settings to deploy and release to sonatype: To sync with Maven Central, do sbt publishSigned, then sbt sonatypeRelease.
@xerial xerial modified the milestone: v07 Jun 16, 2014
BIG_INTEGER
}

private Type tpe;

This comment has been minimized.

@komamitsu

komamitsu Jun 22, 2014
Member

type is better?

This comment has been minimized.

@xerial

xerial Jun 22, 2014
Author Member

It's a private field, which is not visible to the user, but using type is OK.

return isBigInteger() ? biValue.byteValue() : (byte) longValue;
}

public short toShort() {

This comment has been minimized.

@komamitsu

komamitsu Jun 22, 2014
Member

Can you add a comment to explain the difference of toXxxx() and asXxxx? It seems that the difference is whether it allows a cast that causes overflow, though.

This comment has been minimized.

@xerial

xerial Jun 22, 2014
Author Member

NumberValue, a base class of IntegerValue and IntegerHolder, has detailed descriptions of these methods. https://github.com/msgpack/msgpack-java/blob/v07-value/msgpack-core/src/main/java/org/msgpack/value/NumberValue.java

Please look at these docs.

This comment has been minimized.

@xerial

xerial Jun 22, 2014
Author Member

@Override annotation should be here.

This comment has been minimized.

@komamitsu

komamitsu Jun 22, 2014
Member

Please look at these docs.

Thanks :)


@Override
public int hashCode() {
return isBigInteger() ? biValue.hashCode() : (int) longValue;

This comment has been minimized.

@komamitsu

komamitsu Jun 22, 2014
Member

Integer.MAX_VALUE + 1 and Integer.MIN_VALUE return the same hash?

This comment has been minimized.

@xerial

xerial Jun 22, 2014
Author Member

It could be. A better hash function should produce the same pos/negative flag to the original value. Here is the hash function of BigInteger:

    public int hashCode() {
        int hashCode = 0;

        for (int i=0; i<mag.length; i++)
            hashCode = (int)(31*hashCode + (mag[i] & LONG_MASK));
        return hashCode * signum;
    }

We should use a similar hash computation here.

@Override
public ByteBuffer toByteBuffer() {
switch(tpe) {
case STRING:

This comment has been minimized.

@komamitsu

komamitsu Jun 22, 2014
Member

What about BINARY?

This comment has been minimized.

@xerial

xerial Jun 22, 2014
Author Member

It is missing. I will add a statement for BINARY.

public class RawHolder extends RawHolderImpl {

private static class StringValueWrap extends RawHolderImpl implements StringValue {
public StringValue toValue() {

This comment has been minimized.

@komamitsu

komamitsu Jun 22, 2014
Member

Can you add Override annotation here?

This comment has been minimized.

@xerial

xerial Jun 22, 2014
Author Member

ok.

}
}

private StringValueWrap stringWrap = new StringValueWrap();

This comment has been minimized.

@komamitsu

komamitsu Jun 22, 2014
Member

final keyword.

This comment has been minimized.

@xerial

xerial Jun 22, 2014
Author Member

These wraps should be final. Thanks.

package org.msgpack.core;

/**
* Created on 5/28/14.

This comment has been minimized.

@oza

oza Jun 29, 2014
Member

Should we fix the comment to describe what this exception expresses?

This comment has been minimized.

@xerial

xerial Jun 30, 2014
Author Member

OK. I will add a description.

case Code.UINT32: // unsigned int 32
int u32 = readInt();
if(u32 < 0) {
holder.setLong((long) (u32 & 0x7fffffff) + 0x80000000L);

This comment has been minimized.

@oza

oza Jun 29, 2014
Member

Can we calculate this result by using only bit operation instead of the combination of bit operation and add operation?

(long) (u32 & 0xffffffffL)

This comment has been minimized.

@xerial

xerial Jun 30, 2014
Author Member

👍

@xerial
Copy link
Member Author

@xerial xerial commented Nov 29, 2014

�Simplified the interface by removing Cursor and ValueRef. Instead the Value interface now has toImmutable and isImmutable method to create immutable representations.

@frsyuki
Copy link
Member

@frsyuki frsyuki commented Dec 1, 2014

why not adding ImmutableXxxValue interfaces?

@oza
Copy link
Member

@oza oza commented Dec 6, 2014

@frsyuki Do you mean toImmutable methods return ImmutableXxxValue instead of XxxValue?

@frsyuki
Copy link
Member

@frsyuki frsyuki commented Dec 6, 2014

@oza that's right. ImmutableXxxValue are an interfaces extending XxxValue.

As an user, I want to know whether "this value" could be changed by other modules. With ImmutableValue and ImmutableXxxValue, I can use compiler's help to make sure this value is guaranteed to be immutable without writing unlimited amount of tests.
From this point of view, I think it's good design decision where MapValue.toMap returns non-changable map.

@frsyuki
Copy link
Member

@frsyuki frsyuki commented Dec 6, 2014

I'm assuming Cursor, MapCursor and ArrayCursor will be removed. Am I correct?
I can't find implementation of mutable values.

@oza
Copy link
Member

@oza oza commented Dec 6, 2014

Hmm, latest patch still has Cursor, MapCursor and ArrayCursor.

@xerial could you clarify this point?

@xerial
Copy link
Member Author

@xerial xerial commented Dec 9, 2014

@frsyuki OK. I'm now working on adding ImmutableValue interfaces.
And the cursor interface will be removed, but the code of ArrayCursor and MapCusor will be used as an efficient implementation of (mutable) Values. I'm also thinking about implementing ValueHolder as a value, since writing ValueHolder.get().asIntegerValue() is a little bit cumbersome. Allowing to call ValueHolder.asIntegerValue() is much simpler.

@frsyuki
Copy link
Member

@frsyuki frsyuki commented Dec 9, 2014

@xerial Thank you for the explication. Overall, I agree with you.
I think we need only one mutable Value implementation called ValueBuffer because:

  • A purpose to have mutable value is removing overhead to create a new Value instance when we unpack a value.
  • Value should be able to be any types (including array and map) because Unpacker can unpack any types.
  • Current holder/cursor implementations (FloatHolder, RawHolder, ArrayCursor, etc.) can't be different types. We need to create a new value if unpacked type different from an expected type.

My draft code implemented this idea: https://github.com/msgpack/msgpack-java/blob/v07-value-sf/msgpack-value/src/main/java/org/msgpack/value/ValueBuffer.java

A difficulty is reusing nested values to store nested elements of a array and map Value.
For example, in my draft code above, ValueBuffer needs only one ValueUnion when it stores an integer value. However, it needs multiple ValueUnion instances for each element when it stores an array value (because valueBuffer.asArrayValue().get(i) returns Value). This is an essential problem because the same problem anyways happens to deserialize nested values

A known solution is to use an object pool (ValueBufferPool). Another difficulty here is the timing to reuse nested elements. A solution is to add ValueBuffer.release(). These ideas need following complex interfaces:

  • interface ValueBufferAllocator
    • ValueBuffer allocate()
  • interface ValueBufferDeallocator
    • void deallocate(ValueBuffer unused)
  • class ValueBufferPool implements ValueBufferAllocator, ValueBufferDeallocator
  • ValueBuffer.(ValueBufferAllocator allocateFrom, ValueBufferDeallocator releaseTo)
    • ValueBuffer.() will creates a new ValueBufferPool
    • void release() { valueBufferDeallocator.deallocate(this); }
      • this could be reference-counted like Netty's ByteBuf. But could be non-reference-counted like Jetty's ByteBufferPool. I think reference-counting causes too much overhead when the value is a deeply nested map/array.

You may have another idea.

@RoSkin
Copy link

@RoSkin RoSkin commented Apr 17, 2015

How can I pack a JavaBean?I found there is no method to pack JAVABean

@xerial
Copy link
Member Author

@xerial xerial commented Jun 19, 2015

This work in progress at #246

@xerial xerial closed this Jun 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.