
LRU cache guarantee to keep size under limit #3510

Conversation

@xiaoyao1991 (Member)

@xiaoyao1991 force-pushed the lru_cache_guarantee_to_keep_size_under_limit branch from 6c117fb to c0f8b08 on September 26, 2016 06:04
    numBytes.addAndGet(key.remaining() + value.length);

    if (numBytes.get() > sizeInBytes) {
A Contributor commented inline on the diff:
Remove "if" condition since it's covered by "while".
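The snippet under review increments a byte counter after a put and then evicts while over the limit. As a rough, hypothetical reconstruction of that pattern (the class and field names below are illustrative, not Druid's actual code), the `while` condition alone already handles the "nothing to evict" case, so wrapping it in an `if` is redundant:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the eviction loop under review: after a put,
// evict eldest entries while the byte count exceeds the limit.
class ByteLimitedLruCache {
    private final long sizeInBytes;
    private final AtomicLong numBytes = new AtomicLong();
    private final LinkedHashMap<String, byte[]> map =
        new LinkedHashMap<>(16, 0.75f, true); // access-order gives LRU iteration

    ByteLimitedLruCache(long sizeInBytes) { this.sizeInBytes = sizeInBytes; }

    synchronized void put(String key, byte[] value) {
        byte[] old = map.put(key, value);
        if (old != null) {
            numBytes.addAndGet(-old.length);
        }
        numBytes.addAndGet(value.length);
        // "while" alone suffices; a surrounding "if (numBytes.get() > sizeInBytes)"
        // checks the exact same condition and is therefore redundant.
        Iterator<Map.Entry<String, byte[]>> it = map.entrySet().iterator();
        while (numBytes.get() > sizeInBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            numBytes.addAndGet(-eldest.getValue().length);
            it.remove();
        }
    }

    synchronized byte[] get(String key) { return map.get(key); }
}
```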

@guobingkun (Contributor) left a comment:

LGTM

@binlijin (Contributor)

LGTM

@ben-manes

Alternatively, ConcurrentLinkedHashMap is a concurrent LRU cache that supports weighted entries. That might be a good stop-gap, offering much of the Caffeine-based extension's performance, minus the improved eviction policy. The cache was fairly successful: it was adopted by Cassandra and ported into Guava (though at a large performance loss). It is also Java 6 compatible.

@xiaoyao1991 (Member, Author)

@ben-manes Thanks for the suggestion. Is it easy to implement the eviction policy in ConcurrentLinkedHashMap?

@ben-manes

ben-manes commented Sep 27, 2016

CLHM is an LRU policy on top of ConcurrentHashMap. It's intentionally bare-bones, because it took me a long time to stumble on the right design ideas. The challenge with LRU is that every read is a write to global, shared state. The solution is to use a database-style commit log to record and replay events. We then tried to retrofit the ideas onto Guava, and I rewrote it in Caffeine.

So if you're asking whether it's easy to use here for an LRU cache, then yes.

ConcurrentMap<K, V> cache = new ConcurrentLinkedHashMap.Builder<K, V>()
    .maximumWeightedCapacity(1000)
    .weigher(...)
    .build();

If you're asking whether it's easy to implement the ideas from scratch, then it's moderate. But doing it with all the features and tuning in Caffeine was hard. However, that also replaced LRU with a modern policy, which is one of the reasons Druid 0.9.2 adopted it as an extension. Maybe when Druid is JDK8-based they'll make it the default.
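The "database-style commit log" idea mentioned above can be illustrated with a toy sketch. This is my illustrative code, not CLHM's actual implementation: reads append an event to a lock-free buffer instead of mutating the shared LRU ordering, and the buffer is replayed against the ordering only when a tryLock succeeds, so reads never block on eviction state:

```java
import java.util.LinkedHashMap;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

// Toy sketch of a commit-log-style LRU: reads record events, a single
// drainer replays them under a lock. Not CLHM's real code.
class CommitLogLru<K, V> {
    private final int capacity;
    private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<>();
    private final Queue<K> readBuffer = new ConcurrentLinkedQueue<>();
    private final ReentrantLock evictionLock = new ReentrantLock();
    // Access-ordered map, touched only while holding evictionLock.
    private final LinkedHashMap<K, Boolean> order = new LinkedHashMap<>(16, 0.75f, true);

    CommitLogLru(int capacity) { this.capacity = capacity; }

    V get(K key) {
        V v = data.get(key);
        if (v != null) {
            readBuffer.add(key);   // cheap append; no contention on LRU order
            tryDrain();
        }
        return v;
    }

    void put(K key, V value) {
        data.put(key, value);
        evictionLock.lock();
        try {
            drainReadBuffer();
            order.put(key, Boolean.TRUE);
            while (order.size() > capacity) {
                K eldest = order.keySet().iterator().next();
                order.remove(eldest);
                data.remove(eldest);
            }
        } finally {
            evictionLock.unlock();
        }
    }

    private void tryDrain() {
        if (evictionLock.tryLock()) {  // skip if someone else is draining
            try { drainReadBuffer(); } finally { evictionLock.unlock(); }
        }
    }

    private void drainReadBuffer() {
        K key;
        while ((key = readBuffer.poll()) != null) {
            order.get(key);  // replay the read: refresh access order
        }
    }
}
```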

@drcrallen (Contributor)

IMHO we shouldn't try to reinvent the wheel here. The local cache in Druid has tons of problems that make it unsuitable once you get to any appreciable scale, which is why I took @ben-manes's advice and checked out Caffeine, which is working great for us.

The simple local cache does, however, provide a great way to spin up a small Druid cluster with minimal overhead.

@drcrallen (Contributor)

I'm not saying it shouldn't be fixed, but I think focus should be on getting the default local cache more functional and usable, rather than trying to write the-next-big-cache.

@ben-manes

@drcrallen oh I don't think anyone is suggesting that :)

It was unclear whether @xiaoyao1991's question was about usage or implementation. LHM is extended to become a cache, whereas CLHM simply is one (as any other form of a concurrent LHM makes little sense). But I suspect he meant the former rather than the internal details.

Anyway, I think @xiaoyao1991's changes are good and wanted to suggest CLHM as an option if Druid needs to remain JDK7-based.

@drcrallen (Contributor)

Ok, so this forgoes LinkedHashMap's removeEldestEntry hook in order to ensure the cache size never exceeds the configured value in the put call itself. Or, more precisely, it will evict until either there is enough room or the cache is empty.

Thread safety is retained through io.druid.client.cache.MapCache#clearLock, but the contract for LinkedHashMap already mandates external synchronization, so that's ok.

Looks good from here.
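For contrast, here is a minimal sketch of the removeEldestEntry approach that the PR forgoes (the class and fields are hypothetical, not Druid's actual code). The hook fires at most once per insertion, so a byte-weighted cache can remain over its limit after a single large put — which is presumably why the PR evicts in a loop inside put instead:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: removeEldestEntry can evict at most ONE
// entry per insertion, so byte-weighted limits can be overshot.
class HookBasedCache extends LinkedHashMap<String, byte[]> {
    private final long sizeInBytes;
    private long numBytes;

    HookBasedCache(long sizeInBytes) {
        super(16, 0.75f, true);        // access order for LRU semantics
        this.sizeInBytes = sizeInBytes;
    }

    @Override
    public byte[] put(String key, byte[] value) {
        numBytes += value.length;      // count before insert so the hook sees it
        byte[] old = super.put(key, value);
        if (old != null) {
            numBytes -= old.length;
        }
        return old;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
        if (numBytes > sizeInBytes) {
            numBytes -= eldest.getValue().length;
            return true;               // evicts only this one eldest entry
        }
        return false;
    }

    long weight() { return numBytes; }
}
```

With a 10-byte limit, putting 4-, 4-, and 8-byte values evicts only one eldest entry on the last put, leaving the cache at 12 bytes, over its limit.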

@drcrallen drcrallen added the Bug label Sep 27, 2016
@drcrallen drcrallen added this to the 0.9.3 milestone Sep 27, 2016
@drcrallen (Contributor)

👍

it's not letting me start and approve a review :-P

@drcrallen (Contributor)

Thanks for the contribution, @xiaoyao1991. Would you mind filling out the Druid CLA? http://druid.io/community/cla.html

@fjy (Contributor)

fjy commented Sep 27, 2016

@drcrallen he should be covered already

@xiaoyao1991 (Member, Author)

@drcrallen yes, the idea is just like what you said. Thread safety is guaranteed through the synchronized map.

I've just filled out the CLA.

Need me to squash the commits?

@drcrallen
Copy link
Contributor

Nope! GitHub will squash them.

@drcrallen drcrallen merged commit 91e6ab4 into apache:master Sep 28, 2016
fundead pushed a commit to fundead/druid that referenced this pull request Dec 7, 2016
* LRU cache guarantee to keep size under limit

* address comments

* fix failed tests in jdk7
dgolitsyn pushed a commit to metamx/druid that referenced this pull request Feb 14, 2017
* LRU cache guarantee to keep size under limit

* address comments

* fix failed tests in jdk7
seoeun25 pushed a commit to seoeun25/incubator-druid that referenced this pull request Feb 25, 2022