Mike Muske
@mikemuske
@dkoszewnik I saw your note about an additional spot where there may be an overflow. I've looked at it a little and it appears to me like you're right. https://github.com/Netflix/hollow/blob/7ff25c9b3113d731341ca203ee81a46d7ab46cdc/hollow/src/main/java/com/netflix/hollow/core/write/HollowListTypeWriteState.java#L226-L227
Did you confirm it's not a problem?
Dillon Beliveau
@Dillonb

I'd love to see anything you're willing to share about your testing on ARM in addition to this PR Netflix/hollow#503 though :)

I'll try to spend some time this weekend hacking on this to see if I can get something together.

Drew Koszewnik
@dkoszewnik
@mikemuske after posting that, I realized the integer written there is the number of 64-bit longs required to write all of the element data, not the number of elements. So you should be ok there.
Mike Muske
@mikemuske
well, that will be a little smaller, but won't it still exceed 32 bits?
For example, the dataset I'm attempting to load now has 6.6 billion total array elements, 6.6 billion x 33 bits = 217,800,000,000, and dividing by 64 gives 3,403,125,000 which still overflows Integer.MAX_VALUE
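That arithmetic can be checked with a short stdlib-only Java sketch (the helper method here is illustrative, not Hollow's actual code):

```java
public class OverflowCheck {
    // Number of 64-bit longs needed to hold `elements` entries of `bitsPerElement` bits each.
    static long longsRequired(long elements, int bitsPerElement) {
        long totalBits = elements * bitsPerElement; // 6.6B * 33 = 217,800,000,000 -- fits in a long
        return (totalBits + 63) / 64;               // round up to whole longs
    }

    public static void main(String[] args) {
        long longs = longsRequired(6_600_000_000L, 33);
        System.out.println(longs);                     // 3403125000
        System.out.println(longs > Integer.MAX_VALUE); // true -> an int counter would overflow
    }
}
```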
Drew Koszewnik
@dkoszewnik
Your references are 33 bits each?
Mike Muske
@mikemuske
I was thinking they'd have to be... but maybe that's where I went wrong
Drew Koszewnik
@dkoszewnik
The number of bits required for each element will only be the number required to represent the max ordinal of the referenced type in your dataset.
Mike Muske
@mikemuske
got it, so is that 29 bits then?
Drew Koszewnik
@dkoszewnik
So it depends on the cardinality of your referenced type: if you only have 1,000 unique elements, then you'll only need 10 bits.
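The rule can be sketched in stdlib Java (illustrative only; Hollow's actual bit-packing lives in its write states):

```java
public class OrdinalBits {
    // Bits needed to represent any ordinal in 0..maxOrdinal of the referenced type.
    static int bitsRequired(long maxOrdinal) {
        return maxOrdinal == 0 ? 1 : 64 - Long.numberOfLeadingZeros(maxOrdinal);
    }

    public static void main(String[] args) {
        System.out.println(bitsRequired(999));             // 10 -> 1,000 unique elements need 10 bits
        System.out.println(bitsRequired(121_000_000 - 1)); // 27 -> ~121M unique elements need 27 bits
    }
}
```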
Mike Muske
@mikemuske
i see
Drew Koszewnik
@dkoszewnik
6.6 billion seems very large, pushing the limits here -- maybe there's some way to remodel the data?
Mike Muske
@mikemuske
I guess in this case we are expecting about 121 million strings, so we only need 27 bits, but it will still overflow.
Yeah, it's a ridiculous dataset. I think we can shard the arrays into 4 separate spaces, so that will make it fit. But it seems like we could fix this anyway.
Sunjeet
@Sunjeet
Logged this issue with the little data that we were able to gather from our ARM testing: Netflix/hollow#517 @Dillonb
Dillon Beliveau
@Dillonb
Thanks!
Dillon Beliveau
@Dillonb
Hey @Sunjeet, curious to hear your thoughts on Netflix/hollow#518 when you have a chance
mailsanchu
@mailsanchu
Hi all, is there a way to get back the object that was written using mapper.add(node)?
Michał Olejnik
@blekit
Hello guys, probably a silly question from me. I'm trying to build Hollow and I'm getting a few errors related to missing sun.misc.Unsafe (neither building from the console nor from IntelliJ works). I currently have JDK 11 and JDK 15 installed on my machine. Can someone point me towards the change I need to make to build the project?
I want to use Hollow for a small set of data, around 50 entries. The idea of using Hollow is that I don't want to restart my application every time this data changes. The data changes weekly. Would Hollow be a good solution for that?
mil0
@mil0

I understand from reading this Gitter that the time it takes to apply deltas and create a Hollow blob is proportional to the size of the dataset, not the size of the deltas being applied. In production we see this cycle publish time taking upwards of 2 minutes at present using the HollowProducer.Incremental class, and it grows over time as our snapshot gets larger and larger (we typically don't need to delete data from the dataset). Two minutes, however, seems high. We did upgrade our producer hosts to a larger host size with a better CPU (newer EC2 generation) and found that the publish time reduced slightly, but now it is back where it was. This is problematic for us from a throughput perspective: we ingest changes from a stream, and it frequently backs up if the stream encounters increased volume.

We're working on profiling the service to see exactly what is taking so long, but I figured it may be beneficial to ask here:

Are there any tuning parameters (like GC or Hollow config changes), or options to leverage concurrency we could use in order to get this time down? Again, 2 minutes seems high.

Drew Koszewnik
@dkoszewnik

@mil0 @adwu73 how large are your datasets? We have one in particular that is very large and was taking about 2 minutes to publish. For this dataset, I turned off integrity checking (.noIntegrityCheck() when building the HollowProducer) to significantly reduce the publish time.

The integrity check is a failsafe which loads two copies of the data and validates the delta and reverse delta artifacts with checksums. If some unexpected edge case in the framework itself were to be triggered by your dataset, this is where it would be caught.

It's a judgement call whether you want to turn it off. I felt confident in doing so since this particular dataset is far enough back in our content ETL pipeline that it is never directly consumed by our instances responsible for serving member requests, and any unanticipated errors would be caught by a later stage in the pipeline.
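For reference, the builder call mentioned above looks roughly like this (a configuration sketch only; the publisher and announcer are placeholders for your own implementations):

```java
// Sketch: disabling the integrity-check failsafe to reduce publish time.
// Trade-off: delta/reverse-delta checksum validation is skipped.
HollowProducer producer = HollowProducer
        .withPublisher(publisher)   // your HollowProducer.Publisher
        .withAnnouncer(announcer)   // your HollowProducer.Announcer
        .noIntegrityCheck()
        .build();
```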

Drew, thanks for the quick reply. In our case the snapshot is around 350M, and runIncrementalCycle costs around 3s on a 4-core/16GB system. We have disabled the integrity check and it works just fine. Since we want to publish a delta every 500ms, we are trying to squeeze this 3s down to a minimum. I know we are pushing Hollow into a domain it wasn't originally designed for; I just want some help on which directions we could try. I noticed two possibilities after reading through the code: 1. When prepareForNextCycle runs, each TypeWriteState calls ordinalMap.compact(currentCyclePopulated). Can I reduce the frequency of this compact to, say, 10%? Will this hurt anything, given that we will runIncrementalCycle every second? 2. I found that a producer can be designated as primary or not primary. What is the intended use case for multiple producers? Can this be used to speed up the delta publish process?
Allan Boyd
@allanmboyd_gitlab
Hi. I'm wondering what the best or recommended ways are of understanding, in code, what changes are either about to be applied or have just been applied, within either a Hollow producer or a Hollow consumer. For example, if my producer reads from its data source either the complete data set or a delta of changes, then for each object that has changed, how do I find the properties that have changed and their new values, either in a producer or a consumer? Any suggestions? (Thanks)
Fisher Evans
@fisherevans

Howdy. I have a Hollow datastore that is fairly static. I'd like to be able to initialize a Hollow consumer within an AWS Lambda and manually manage refreshing the store as part of invocation. Is this a pattern anyone has attempted?

I'm running into issues creating a client updater that does not rely on self-created background threads. I feel like I can sort of hack the refresh (and short-circuit the supplied executor), but I can't figure out a way to self-manage the StaleHollowReferenceDetector thread.

Is it even worth trying this? Would I need to write my own consumer/updater?

rajakumarjk1
@rajakumarjk1

Hi,
When I import the hollow-reference-implementation Gradle project into my Eclipse workspace after cloning it from https://github.com/Netflix/hollow-reference-implementation, I get the error below.

FAILURE: Build failed with an exception.

• Where:

• What went wrong:
Plugin [id: 'nebula.info', version: '3.6.0'] was not found in any of the following sources:

• Plugin Repositories (could not resolve plugin artifact 'nebula.info:nebula.info.gradle.plugin:3.6.0')
Searched in the following repositories:

• Try:

Run with --stacktrace option to get the stack trace.
Run with --info or --debug option to get more log output.
Run with --scan to get full insights.

• Get more help at https://help.gradle.org

Deprecated Gradle features were used in this build, making it incompatible with Gradle 8.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

CONFIGURE FAILED in 41ms

Configure project :
Inferred project: hollow-reference-implementation, version: 0.1.0-SNAPSHOT

FAILURE: Build failed with an exception.

• Where:

• What went wrong:
An exception occurred applying plugin request [id: 'nebula.netflixoss', version: '3.5.2']

Failed to apply plugin class 'nebula.plugin.info.dependencies.DependenciesInfoPlugin'.
Could not create plugin of type 'DependenciesInfoPlugin'.
No signature of method: org.gradle.api.internal.artifacts.ivyservice.ivyresolve.strategy.DefaultVersionComparator.asStringComparator() is applicable for argument types: () values: []

  Possible solutions: asVersionComparator()
• Try:

Run with --stacktrace option to get the stack trace.
Run with --info or --debug option to get more log output.
Run with --scan to get full insights.

• Get more help at https://help.gradle.org

Deprecated Gradle features were used in this build, making it incompatible with Gradle 8.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

CONFIGURE FAILED in 146ms

Could you please help me fix the above issue so I can import the Hollow project into my Eclipse workspace?
Yoni BenDayan
@yonibendayan

Hey all,
I want to restore a producer from an existing running consumer, like this:

 producer.getValue().getWriteEngine().restoreFrom(consumer.getStateEngine());

When a new cycle starts that updates an existing key, the data gets duplicated instead of updated.
Until now we have been successfully using this function:

producer.restore(announcementWatcher.getLatestVersion(), blobRetriever);

but this function creates a consumer by itself, and I want to restore the producer from another consumer.

The main difference I saw is that at the end of a successful restore, the producer's object mapper replaces its write state with the new one, but since it's a private field I'm unable to do that myself.

My question is whether there is a better way to achieve this, or should I open an issue?
Thanks!

mailsanchu
@mailsanchu
@Sunjeet We are seeing a very decent performance from SHARED_MEMORY_LAZY blob. Do you see any issues using that in production?
Henry Mai
@maiakhoa
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
at com.netflix.hollow.api.producer.HollowIncrementalCyclePopulator.populate(HollowIncrementalCyclePopulator.java:53) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.api.producer.HollowProducer.runCycle(HollowProducer.java:438) [golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.api.producer.HollowProducer.runCycle(HollowProducer.java:390) [golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.api.producer.HollowIncrementalProducer.runCycle(HollowIncrementalProducer.java:206) [golftec-api-1.0-jar-with-dependencies.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_292]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_292]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_292]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_292]
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
at com.netflix.hollow.core.util.SimultaneousExecutor.awaitSuccessfulCompletion(SimultaneousExecutor.java:118) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
... 10 common frames omitted
Caused by: java.lang.NullPointerException: null
at com.netflix.hollow.core.write.objectmapper.HollowObjectTypeMapper.write(HollowObjectTypeMapper.java:170) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.core.write.objectmapper.HollowMapTypeMapper.write(HollowMapTypeMapper.java:76) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.core.write.objectmapper.HollowObjectTypeMapper$MappedField.copy(HollowObjectTypeMapper.java:470) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.core.write.objectmapper.HollowObjectTypeMapper.write(HollowObjectTypeMapper.java:176) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.core.write.objectmapper.HollowObjectMapper.add(HollowObjectMapper.java:70) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.api.producer.WriteStateImpl.add(WriteStateImpl.java:41) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
at com.netflix.hollow.api.producer.HollowIncrementalCyclePopulator$2.run(HollowIncrementalCyclePopulator.java:136) ~[golftec-api-1.0-jar-with-dependencies.jar:na]
... 5 common frames omitted
I got this error from the producer in production.
Can anyone guess at a possible cause and help me with this?

I've got a hollow dataset that ideally I'd split between a "hot" set of current data (eg non-archived, non-expired, "active" records), and a larger set of "archived" data that's only of interest to some clients. As an analogy, think of a catalog of items in an online store, many of which are no longer offered for sale, but you still need to maintain records to resolve data about historical orders.

I'm looking at some of the filtering/splitting options (https://hollow.how/tooling/#dataset-manipulation-tools), but I'm not sure I can see a way to make them work - in my case, it's about having a smaller set of records for the same types, rather than excluding specific types or fields.

The more heavy-handed option is to just create two entire Hollow datasets, with two producers, which can share the same model. That will work, but you lose the flexibility of letting clients decide how they filter. Before I go down this path, I'm wondering if anyone else has used the filtering/combining tools for this use case?


I was extremely dismayed to discover this week that the producer validation listeners (eg DuplicateDataDetectionValidator) run after content has been written out to the persisted blob store.

Although it did prevent the faulty version from being announced, the resulting cleanup has proved hard enough that we've given up and will just create a brand new blob store and get all clients to switch.

Although this post-write validation behaviour is actually documented, it's extremely surprising and greatly reduces the usefulness of the validators.

Drew Koszewnik
@dkoszewnik
@adrian-skybaker can you please help me understand why this resulted in a difficult cleanup? CB failures happen very often for us, and we find it useful to be able to retrieve the "bad" blobs from the blob store. Once the underlying issue is corrected, the next produced state will have a delta from the last announced state (artifacts for which are also still around in the blob store). We never actually do any sort of manual cleanup of the blob store after this event.
Sahith Nallapareddy
@snallapa
Hello, I am wondering: is there some sort of delay after applying a delta? I have a Hollow dataset that is updated by daily producer jobs. In the consumer, we use the unique key index to query the dataset. Every day I can see in the logs that the deltas are applied; however, sometimes when the consumer queries the data for keys that should exist, it returns null. I set up a test consumer, queried the snapshot myself, and found that the key does exist. After restarting our consumer, we find that it does find the data. I'm wondering: if we waited longer, would the consumer eventually be able to query the key? Or could there be some weird bug in our implementation of some of the Hollow classes?
Sahith Nallapareddy
@snallapa
Similarly, are there cases where a delta is applied but the unique key index does not update? I am finding weird issues with longer-running consumers: keys do not seem to be found, but I have confirmed that the deltas were applied and that the snapshot itself contains those keys.
graham_ren
@graham_ren:matrix.org

Hello, I am using Hollow 7.1.1.
producer init:
val producer = HollowProducer.withPublisher(publisher).withAnnouncer(announcer)
.withNumStatesBetweenSnapshots(5)
.buildIncremental()
write data:
s3Producer.runIncrementalCycle { writer ->
}

I have encountered this error: Caused by: java.io.IOException: Attempting to apply a delta to a state from which it was not originated!

Can someone tell me how to fix this?

Jeeukrishnan Kayshyap
@Jeeukrishnan
Hi.
My Hollow API files are generated at runtime; after these files are generated, I can use the consumer, because it uses the .withGeneratedAPIClass() function. How can I get rid of this step, or how can I generate all of the Hollow files at compile time instead?
Jeeukrishnan Kayshyap
@Jeeukrishnan
And why did Netflix decide to generate the custom APIs after executing the producer code? What would their use case be? Is there any way to generate these APIs at compile time?
Jeeukrishnan Kayshyap
@Jeeukrishnan
Hi, can anyone provide any sort of help?
Alexandru-Gabriel Gherguț
@AlexandruGhergut
@Jeeukrishnan check out this plugin: https://github.com/nebula-plugins/nebula-hollow-plugin. It adds a Gradle task that you can call at build time to generate the API before compilation. If you don't want to use the plugin, you could probably write some custom Java code yourself that you call at build time.
mailsanchu
@mailsanchu
@Jeeukrishnan Did you google this before posting it here?
Jeeukrishnan Kayshyap
@Jeeukrishnan
@mailsanchu yes I did sir
@AlexandruGhergut Hi, thanks for your response. Could you please help me with Maven? I tried googling and using a dependency that I found, but it seems the APIs are still not getting generated. Please give me some guidance!
mailsanchu
@mailsanchu
@Jeeukrishnan You need to improve your googling skills. Here is an example found with mine: https://github.com/IgorPerikov/hollow-maven-plugin-examples/blob/master/single-module-example/pom.xml
learnerjava830
@learnerjava830
What is the best open-source tool/software for performance verification, to identify performance bugs in Java?
learnerjava830
@learnerjava830

Text Over Image with a Java Web Application

I want to display an image in a web application where the user can add text on top of the image.

Finally, I need to save it in the DB; later the user has to view the editable text and edit it if required.

How do I achieve this in a Java web application: UI? back end? DB (JSON, image, or coordinates)?

Can any open source software be used at all these levels? Can someone offer some comments/feedback?

learnerjava830
@learnerjava830

URL url = new URL(...); --> this FAILS when I try to download an https image: "javax.imageio.IIOException ... Can't get input stream from URL!"

Note:
The URL works from a browser.
The URL works in a standalone program.
The URL fails when used in a Java web application.

Questions:

1. Is any special treatment required to access an https image from a web app?
2. How does it work in a standalone program but not via the web app, even though the certs are not installed on the local machine?

What is the correct approach, and what are the underlying differences?

Thanks