    Tim Taylor
    @toolbear

    @orjan are you using the HollowConsumer API, or did you use the lower-level APIs to build your consuming client?

    If you have a consumer, one possibility is to keep a reference to your consumer in a field visible to your refresh listener. We recommend something like the following:

    consumer = consumerBuilder
        …
        .build();
    customListener = new MyRefreshListener(consumer);
    consumer.addRefreshListener(customListener);
    
    consumer.getInitialLoad()
        .thenApply(l -> customListener.doSomethingOnFirstSnapshot());

    The getInitialLoad() call is required because after calling build() a background thread will likely have already started the initial load. Because addRefreshListener(customListener) comes after, that listener won’t receive a snapshotUpdateOccurred for the initial load of data.

    What you can do is refactor the body of snapshotUpdateOccurred into a separate method called from snapshotUpdateOccurred and also called when the future returned by getInitialLoad() completes.
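
    A minimal sketch of that refactor (MyRefreshListener and doSomethingOnFirstSnapshot are the names from the snippet above; rebuildDerivedData is a made-up name for the shared body, and extending HollowConsumer.AbstractRefreshListener is assumed so only the relevant callback needs overriding):

    class MyRefreshListener extends HollowConsumer.AbstractRefreshListener {
        private final HollowConsumer consumer;

        MyRefreshListener(HollowConsumer consumer) {
            this.consumer = consumer;
        }

        @Override
        public void snapshotUpdateOccurred(HollowAPI api, HollowReadStateEngine stateEngine, long version) {
            rebuildDerivedData();   // same work as the initial-load path
        }

        // called when the future returned by consumer.getInitialLoad() completes
        void doSomethingOnFirstSnapshot() {
            rebuildDerivedData();
        }

        // the shared body formerly inside snapshotUpdateOccurred
        private void rebuildDerivedData() {
            // rebuild your index / derived structures from consumer here
        }
    }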

    If you want to block on the initial load, you could also write:

    consumer = consumerBuilder
        …
        .build();
    customListener = new MyRefreshListener(consumer, ...);  // <-- listener calls an updateLucene(index) helper also visible here
    consumer.addRefreshListener(customListener);
    
    consumer.getInitialLoad().join();
    index = UniqueKeyIndex.from(consumer, …);
    updateLucene(index);
    Dennis Levin
    @d-levin

    Has anyone successfully gotten the HollowIncrementalProducer to work together with HollowJsonAdapter? I'd like to feed JSON to the producer but HollowIncrementalProducer does not let me do this:

    producer.runCycle(state -> {
        HollowWriteStateEngine writeEngine = state.getStateEngine();
        HollowJsonAdapter jsonAdapter = new HollowJsonAdapter(writeEngine, dataTypeName);

        jsonAdapter.populate(json);
    });

    since its runCycle method does not take any arguments

    Tim Taylor
    @toolbear

    @d-levin yeah, the JsonAdapter preceded HollowIncrementalProducer and wasn’t really geared towards incremental usage

    Additionally, HollowProducer and HollowIncrementalProducer have a POJO/ObjectMapper bias in their current design. The WriteState passed to HollowProducer.Populator lets you get access to lower-level APIs and thus use the JsonAdapter, but the incremental producer API is more hermetically sealed.

    If, say, your use-case is ingesting JSON messages from a Kafka queue, you could use Jackson’s ObjectMapper to convert the JSON into POJOs that you feed to the incremental producer. This isn’t quite as performant as going directly from JSON to a WriteStateEngine like the JsonAdapter does, but might work for you.
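
    For example, something along these lines (just a sketch; Movie is a hypothetical POJO matching your JSON, and the Kafka plumbing is omitted):

    ObjectMapper jackson = new ObjectMapper();   // Jackson
    HollowIncrementalProducer incrementalProducer = new HollowIncrementalProducer(producer);

    // for each JSON message pulled off the queue:
    Movie movie = jackson.readValue(json, Movie.class);   // JSON -> POJO
    incrementalProducer.addOrModify(movie);               // stage the change

    // after a batch of messages has been staged:
    long version = incrementalProducer.runCycle();        // writes the snapshot/delta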
    Tim Taylor
    @toolbear
    We have some changes in the works that should make this much better and still performant, but that’s probably several months out before we’re ready to support the community using it.
    Dennis Levin
    @d-levin

    @toolbear thanks for the update!

    Our initial implementation was using HollowProducer and we were actually using the JsonAdapter to avoid having to create POJOs in the first place. We basically fetched data from the database, converted it into JSON objects, then passed that into the JsonAdapter. Unfortunately we cannot grab all the data from the database on every cycle, so we started looking at the incremental producer as a solution.

    Using Kafka is on the map but for now we're simply querying a database to get records added/modified and records deleted since the last query. We instantiate POJOs with the data retrieved and pass the POJOs into the incremental producer.

    What we've been trying to do is get rid of the POJOs completely on the producer side. Sounds like we have to roll our own IncrementalProducer for now? I tried converting the DB data into FlatRecords and passing them into the incremental producer but had no luck.

    Tim Taylor
    @toolbear

    Sounds like we have to roll our own IncrementalProducer for now?

    Yeah, that’s likely your best approach in the near term. You can look at what HollowIncrementalProducer is doing under the covers.

    Also, the Delta-Based Producer Input section of the guide talks about the lower-level APIs you would use. “Delta-Based Producer Input” is what we called it before coining “Incremental Producer” to avoid overloading the “delta” term, but that section describes the same thing as an incremental producer.

    jkade
    @jkade
    Hi again, all! I happened upon an interesting behavior that looks like a bug to me in Hollow 3.4.5. I wrote it up as a bug on the project: Netflix/hollow#399
    I'm not sure it still happens for Hollow 4.2. I plan to test when I get the chance to pull down 4.2.
    Tim Taylor
    @toolbear

    Release News

    Hollow v4.3.0 released (bintray)

    Changes:

    • miscellaneous performance improvements
    • deprecated HollowObjectHashCodeFinder and its usages
    Tim Taylor
    @toolbear

    I'm not sure it still happens for Hollow 4.2. I plan to test when I get the chance to pull down 4.2.

    Thanks for the report, @jkade. I reproduced this in 4.2 (now 4.3) and adapted your example test for #400

    Still investigating.

    jkade
    @jkade
    Thanks so much, @toolbear!
    Tim Taylor
    @toolbear

    Release News

    Hollow v4.4.0 released previously (bintray)

    • ✨ introduce similar metrics refactor in producer as was done in the consumer

    Hollow v4.5.0 released (bintray)

    • ☔ fail fast with meaningful error when ordinal or serialized size limits exceeded
    • ✨ ensure all threads started by hollow are named
    Dennis Levin
    @d-levin
    :fireworks:
    Michał Olejnik
    @blekit
    Does Hollow support incremental updates to a collection (e.g. a Map) in some type? In my use case I would like to add new entries to the Map, which is a field in some top-level Hollow type, as they arrive. I think this can be achieved if we read the previous state of the updated object and modify the map. However, I'm wondering if this can be done somewhat out of the box?
    Tim Taylor
    @toolbear

    @blekit I think what you’re asking is whether you can make updates just to the changed thing without having to populate the entire dataset all over again. That’s what the incubating HollowIncrementalProducer does. That’s an evolving API (but with production deployments at Target and maybe others). You can also accomplish the same results by using stable, lower-level APIs as documented under Delta-Based Producer Input.

    Whichever approach you take (stock HollowProducer, HollowIncrementalProducer, or the lower-level APIs), when you have partial changes to a Collection the delta contains only the necessary changes.

    Because Hollow structures on the consumer side are immutable, this does mean an entire new Map record will be added (and the old one removed). This may sound like a lot, but any keys or values that are unchanged from the old map aren’t retransmitted. Only the new keys or values plus the map itself will be in the delta.

    Michał Olejnik
    @blekit
    @toolbear thanks for the quick answer, but I meant something slightly different. We're already using HollowIncrementalProducer, as we receive our data as events and don't want to scan through the entire source of truth every time we need to produce a snapshot. However, we've encountered a problem: for one particular type in our model we might run out of ordinals. One of the solutions we're considering, made possible by the latest fix to HashMaps and HashSets, is to flatten our model and represent the problematic type as a HashMap in its owning entity. However, this poses the problem I was asking about previously: the data updates for the type we would like to store in the HashMap arrive as independent events, and at the moment of receiving a new one we don't have access to the ones that came earlier. We can create an instance of the owning type and add the new event to the map, but we're wondering how it would behave once such an object is passed to the Hollow Producer. Would we end up with a state where there is only the new entry in the map on the consumer side, or is there a way to just add it to the currently existing one?
    Tim Taylor
    @toolbear

    @blekit thanks for explaining.

    No, there isn’t a way on the producer side to do partial modifications of a map object that was in the previous state.

    One possible workaround…

    If I understand your description of your data model correctly, it resembles something like this:

    @HollowPrimaryKey(fields = "a1")
    class Alpha { // root type
      Long a1;
      Map<String, Bravo> a2;
      ...
    }
    
    class Bravo {
      Integer b3;
      Charlie b4;
      …
    }

    Instead, make Bravo a root type, eliminate Map<String, Bravo>, and use a HashIndex for lookups instead of map.get. The data model might look like:

    @HollowPrimaryKey(fields = "a1")
    class Alpha { // root type
      Long a1;
      // a2 removed; each map entry becomes a Bravo record (see b2 below)
    }
    
    class Bravo {
      Long b1; // “foreign” key to Alpha.a1
      String b2; // formerly Alpha.a2.key
      Integer b3;
      Charlie b4;
      ...
    }

    Then construct a HashIndex for Bravo with match paths “b1.value”, “b2.value”.

    If there’s only 1 Alpha in your dataset, then you don’t even need Bravo.b1.
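
    For reference, a rough sketch of that lookup using the lower-level HollowHashIndex (consumer is your HollowConsumer; the type and field names come from the hypothetical Alpha/Bravo model above):

    HollowHashIndex bravoIndex = new HollowHashIndex(
            consumer.getStateEngine(),    // HollowReadStateEngine
            "Bravo",                      // type to select
            "",                           // select the Bravo record itself
            "b1.value", "b2.value");      // match paths

    HollowHashIndexResult result = bravoIndex.findMatches(123L, "someKey");
    if (result != null) {
        HollowOrdinalIterator it = result.iterator();
        for (int ordinal = it.next(); ordinal != HollowOrdinalIterator.NO_MORE_ORDINALS; ordinal = it.next()) {
            // look up the matching Bravo record at this ordinal via your generated API
        }
    }

    (Remember the index needs to be rebuilt or kept up to date when the consumer refreshes.)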
    sdelarbre
    @sdelarbre
    Hey guys, I was trying to start the demo from https://hollow.how/quick-start/. But when trying to start it (after updating some build.gradle items) I always get:
     Stijns-MacBook-Pro:hollow-reference-implementation stijndelarbre$ javac src/main/java/how/hollow/producer/Producer.java
    src/main/java/how/hollow/producer/Producer.java:20: error: package com.netflix.hollow.api.consumer.HollowConsumer does not exist
    build.gradle:
    buildscript {
      repositories { jcenter() }
    }
    
    plugins {
      id 'nebula.netflixoss' version '5.1.1'
    }
    
    plugins {
        id 'java'
    }
    
    sourceCompatibility = '1.8'
    targetCompatibility = '1.8'
    
    
    apply plugin: 'java'
    compileJava {
        sourceCompatibility = 1.8
        targetCompatibility = 1.8
    }
    
    repositories {
      jcenter()
    }
    
    dependencies {
        compile 'com.netflix.hollow:hollow:3.0.1'
        compile 'com.netflix.hollow:hollow-diff-ui:3.0.1'
        compile 'com.netflix.hollow:hollow-explorer-ui:3.0.1'
        compile 'org.eclipse.jetty:jetty-server:9.4.2.v20170220'
    
        compile 'com.amazonaws:aws-java-sdk-s3:1.11.49'
        compile 'com.amazonaws:aws-java-sdk-dynamodb:1.11.49'
    
        compile 'commons-io:commons-io:2.4'
    
        testCompile 'junit:junit:4.11'
    }
    
    task( producer, dependsOn: jar, type: JavaExec ) {
        main = 'how.hollow.producer.Producer'
        classpath = sourceSets.main.runtimeClasspath
    }
    
    task( consumer, dependsOn: jar, type: JavaExec ) {
        main = 'how.hollow.consumer.Consumer'
        classpath = sourceSets.main.runtimeClasspath
    }
    sdelarbre
    @sdelarbre
    Or when starting from the build directory:
    Stijns-MacBook-Pro:hollow-reference-implementation stijndelarbre$ cd build/classes/java/main/
    Stijns-MacBook-Pro:main stijndelarbre$ java how.hollow.producer.Producer
    Error: Unable to initialize main class how.hollow.producer.Producer
    Caused by: java.lang.NoClassDefFoundError: com/netflix/hollow/api/producer/HollowProducer$Publisher
    Manuel Quinones
    @manuelq
    Question: has anyone here done anything with scheduling when a snapshot is created in a producer versus a number of deltas? Is there a hook in the producer to trigger a snapshot to occur on the next cycle? Asking due to size concerns. We want to be able to create deltas during the day and then have the snapshot created during the night for bandwidth reasons.
    sunray2003
    @sunray2003
    I am considering Hollow for caching 5 GB of JSON Schema data. Consumers would validate incoming JSON messages using the cached JSON Schemas. What data model would you recommend for storing JSON Schema data? Can it be stored as a String or a Jackson JsonObject? Are there any rules of thumb on the memory requirements for caching data that is 5 GB on disk?
    Abhishek
    @abhishek
    Hello! I am trying to close the loop while using Netflix Hollow. Any pointers on that?
    I am thinking I could have each of my 30 clients check in at a central location with their latest version number, which would allow me to know, every second, what the lowest version number across the clients is. However, I am trying to figure out in which version a specific key was updated.
    Abhishek
    @abhishek
    I am trying to read the blob in a HollowConsumer listener, but that's just raw data. I need to pass it through something (not sure what yet).
    That's what I am digging into at the moment.
    Abhishek
    @abhishek
    This place seems pretty unfrequented.
    jkade
    @jkade
    It has been a little quiet lately, @abhishek. I don't quite understand your question, though. It looks like a lot of questions recently asked have pretty easily discoverable answers.
    @toolbear, I noticed the addition of the DuplicateDataDetectionValidator. I think I mentioned before that I was able to produce snapshots with multiple HollowObjects with different ordinals and the same primary key. Detecting them and preventing them from getting out to publication is great, but is there any way to remove duplicates that have already gotten in, particularly in an IncrementalProducer situation?
    jkade
    @jkade

    Oh, I guess the validator is not new, just the copyright header :)

    Nevertheless, I am curious...

    yogesh-hce
    @yogesh-hce

    Hi guys, I am facing a very weird problem in all my micro-services: a Hollow update fails with a 'java.lang.OutOfMemoryError: Java heap space' exception.

    The microservice is still working fine, and the amount of free heap space after the failure is still about 400 MB. However, subsequent (delta) updates also fail, but with the message 'Update plan contains known failing transition!'.

    Did anyone face this issue?
    Let me explain what happens when updating Hollow with snapshots and deltas. In the following example the snapshots are called A, B, C, D and the deltas A1, A2, B1, C1, D1, etc.
    The service starts with the most recent version; let's assume this is snapshot A. It loads this into memory. After a while we get updates A1 and A2. This works fine. Then we need to go to B, which might work fine, and the updates to C and C1 also work. Then we receive update D. It fails with an OOM, so we are still at C1. When D1 arrives, the update fails with the message 'Update plan contains known failing transition!', because to get from C1 to D1 Hollow's update plan is C1 → D → D1, and since Hollow remembers that the transition from C1 to D failed, it won't attempt the update.
    Please assist if anyone has any idea on this.
    Kevin McLarnon
    @kmclarnon
    Has anyone ever encountered errors with hash or primary key indices getting into a bad state due to a schema change? I have a running consumer reading the current state, and then the current state was updated to add another field to one of the object types. After that, the primary key index started returning incorrect ordinals as matches.
    Adam Keyser
    @adamkeyser

    Hey - got a kind of silly question. Looking at -> https://github.com/Netflix/hollow/blob/b0c64ccb2386b76fe144c2db5a441c7ef1854409/hollow/src/main/java/com/netflix/hollow/core/memory/ByteArrayOrdinalMap.java#L32

    Mostly curious why the ordinal and its respective lookup value are packed into the same long - is that to preserve space, or to enhance performance? What kind of performance impact would there be if you changed it to BigInteger?

    Adam Keyser
    @adamkeyser
    Or a UUID?
    Drew Koszewnik
    @dkoszewnik
    Hi @adamkeyser, this uses an atomic long array both for space and time efficiency.
    Adam Keyser
    @adamkeyser
    @dkoszewnik Thanks - we recently bumped into the 532m ordinal limit and wondered what sense it would be to change it.
    Drew Koszewnik
    @dkoszewnik
    @adamkeyser wow, interesting. How did you solve this in the short term?
    If you want to experiment with another implementation of ByteArrayOrdinalMap, you might use an AtomicReferenceArray of a bean that contains a separate int and long value for these two components (ordinal and offset).
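    Just to illustrate the shape of that idea (not a drop-in replacement; OrdinalEntry, tableSize, and bucketIndex are made-up names standing in for the surrounding map logic):

    // each bucket holds the ordinal and the byte-array offset as separate fields
    // instead of packing both into one long
    final class OrdinalEntry {
        final int ordinal;
        final long offset;

        OrdinalEntry(int ordinal, long offset) {
            this.ordinal = ordinal;
            this.offset = offset;
        }
    }

    AtomicReferenceArray<OrdinalEntry> buckets = new AtomicReferenceArray<>(tableSize);

    // claiming an empty bucket is still a single CAS, because the whole entry is swapped atomically
    boolean claimed = buckets.compareAndSet(bucketIndex, null, new OrdinalEntry(ordinal, offset));

    The trade-off is an extra object allocation and a pointer dereference per bucket compared to the packed AtomicLongArray; that's where the current implementation's space and time efficiency comes from.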
    Adam Keyser
    @adamkeyser
    @dkoszewnik Right now we offer a wrapper on top of Hollow to automate the creation of (geographically) distributed APIs. In this event we had to reject the request. I'd be pretty interested in exploring your suggestion above, but I'm not entirely sure what kind of cost you'd run up against. At that dataset size, updates start to take a fair bit of time from the producer standpoint, so I'm trying to figure out whether that's a rational limit on the dataset size Hollow should support.
    Andrew Riedel
    @delta50
    This might be a silly question, but is there a way to convert from the Hollow-generated POJOs back to the original POJOs?
    Drew Koszewnik
    @dkoszewnik
    Hi @delta50, I'm assuming you mean the generated Hollow API used to access the data on a consumer. In order to do this, you'll have to copy the data from your Hollow data state back into the POJOs. Currently there is no tool available to assist with this.
    I posted this article to the Netflix Tech Blog this morning: https://medium.com/netflix-techblog/re-architecting-the-video-gatekeeper-f7b0ac2f6b00
    yupegom
    @yupegom

    Hello everybody. I'm new to Hollow and I have some doubts. According to the docs:

    Once the data has been populated into a producer, that producer's state engine is aware of the data model, and can be used to automatically produce a client API.

    However, in the code they do this: mapper.initializeTypeState(Movie.class). What if my consumer is not in the same place as my data model? I mean, what if I don't have that class (Movie.class) in my sub-domain/component? Or is it a must, meaning I have to share the data model across the board to be able to consume data?

    Fuyang Liu
    @liufuyang
    Hi there, I have a question about trying to force the producer to make a new snapshot from the current dataset. It seems to work on the producer side; however, the running consumer gets into a state where it still seems to be holding the old snapshot's data. And when restarted, the consumer still continues to load the old snapshot, not the newly created one.
    Is it because we did something wrong when forcing the producer to make a new snapshot?
    Fuyang Liu
    @liufuyang
    public boolean forceSnapshot() {
        HollowProducer temporaryProducer = producerBuilder.build();
        HollowWriteStateEngine writeEngine = temporaryProducer.getWriteEngine();
        writeEngine.prepareForNextCycle();
    
        final HollowBlobWriter blobWriter = new HollowBlobWriter(writeEngine);
        final List<T> entities = getUpdatedEntitiesAfter(0L);
        final HollowFilesystemBlobStager stager = new HollowFilesystemBlobStager();
        final Long versionOfSnapshot = Math.max(getMaxUpdatedTimestamp(entities), currentVersion) + 1L;
        final HollowProducer.Blob blob = stager.openSnapshot(versionOfSnapshot);
    
        entities.forEach(temporaryProducer.getObjectMapper()::add);
    
        try {
          OutputStream output = new BufferedOutputStream(Files.newOutputStream(blob.getPath()));
          blobWriter.writeSnapshot(output);
          publisher.publish(blob);
          announcer.announce(versionOfSnapshot);
          incrementalProducer.restore(versionOfSnapshot, retriever);
          this.currentVersion = versionOfSnapshot;
        } catch (IOException e) {
          LOG.error("There was an error while reindexing a new snapshot.", e);
          return false;
        }
    
        return true;
      }
    Basically we have some code that looks like this on the producer side to do the force snapshot.
    Fuyang Liu
    @liufuyang
    Oh, we figured out that it wasn't related to the double snapshot. Basically it was because we only called hollowConsumer.getAPI() once and kept using that API instance, which doesn't get updated. The solution is to call hollowConsumer.getAPI() every time we need the API.
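    In other words, something like this on the consumer side (MovieAPI stands in for whatever your generated API class is):

    // don't cache the API object across refreshes; ask the consumer for it each time you need it
    MovieAPI api = (MovieAPI) hollowConsumer.getAPI();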