Hypothesis

2 Matching Annotations

Jun 2026
xcena.com

Untitled document

1
1. fxp007 05 Jun 2026
  
  in Public
  
  the lack of KV sharing across requests leads to redundant prefill computation and wasted memory.
  
  KV sharing across concurrent requests is a non-obvious efficiency lever: if two users send prompts that overlap (for example, the same system prompt), their prefill KV states are still computed independently. A CXL shared memory pool makes cross-request KV reuse architecturally possible without expensive GPU-to-GPU transfers.
  
  kv-sharing prefill multi-tenant-inference
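
To make the redundancy in the note above concrete, here is a minimal sketch of cross-request prefix reuse. It is a toy model, not the SDK's API: `SharedPrefixKVCache`, the token IDs, and the string placeholders standing in for KV tensors are all hypothetical, and a plain Python dict stands in for a memory pool that every request can see.

```python
from typing import Dict, List, Tuple


class SharedPrefixKVCache:
    """Toy stand-in for a KV pool visible to all requests (hypothetical)."""

    def __init__(self) -> None:
        # Maps a token prefix to a placeholder for its KV state.
        self._store: Dict[Tuple[int, ...], str] = {}
        self.tokens_computed = 0  # prefill work actually performed

    def prefill(self, tokens: List[int]) -> int:
        """Return how many leading tokens were reused from the cache."""
        # Find the longest already-cached prefix of this prompt.
        reused = 0
        for end in range(len(tokens), 0, -1):
            if tuple(tokens[:end]) in self._store:
                reused = end
                break
        # "Compute" (here: just count) KV state for the remaining tokens,
        # registering each new prefix so later requests can reuse it.
        for end in range(reused + 1, len(tokens) + 1):
            self._store[tuple(tokens[:end])] = f"kv[{end}]"
            self.tokens_computed += 1
        return reused


cache = SharedPrefixKVCache()
system_prompt = [101, 7, 7, 42]  # assumed shared prefix, e.g. a system prompt
reused_a = cache.prefill(system_prompt + [5, 6])  # first request: nothing cached
reused_b = cache.prefill(system_prompt + [9])     # second request: prefix reused
print(reused_a, reused_b, cache.tokens_computed)  # 0 4 7
```

Without sharing, the two requests would prefill 6 + 5 = 11 tokens; with the shared store only 7 are computed. Storing every prefix is wasteful and is done here only to keep the lookup simple; a real system would more likely share at block or page granularity.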

Tags

multi-tenant-inference

prefill

kv-sharing

Annotators

fxp007

URL

xcena.com/sdk_overview
Jun 2017
www.elastic.co

Create Index | Elasticsearch Reference [5.4] | Elastic

1
1. ramlinuxprasad 02 Jun 2017
  
  in Public
  
  The create index API allows to instantiate an index. Elasticsearch provides support for multiple indices, including executing operations across several indices.
  
  With this API you can configure shards differently for each index in Elasticsearch (for example, a different number of shards per index). Super useful when you have a single cluster serving multiple tenants.
  
  elastic shards multi tenant
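
As an illustration of the per-index shard configuration mentioned in the note above, the sketch below builds create-index requests with different `number_of_shards` values per tenant. It only constructs the HTTP method, path, and JSON body and does not contact a cluster; the tenant index names and shard counts are made up, and `build_create_index_request` is a hypothetical helper, not part of any Elasticsearch client.

```python
import json
from typing import Dict, Tuple


def build_create_index_request(index: str, shards: int, replicas: int) -> Tuple[str, str, str]:
    """Return (method, path, JSON body) for a create-index call."""
    body: Dict = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }
    return "PUT", f"/{index}", json.dumps(body, sort_keys=True)


# Hypothetical tenants with different sizing needs: (shards, replicas).
tenants = {"tenant-small": (1, 1), "tenant-large": (6, 1)}
calls = [build_create_index_request(name, s, r) for name, (s, r) in tenants.items()]
for method, path, body in calls:
    print(method, path, body)
```

Each tuple can be sent with any HTTP client. Note that the number of primary shards is fixed when the index is created, so it is worth choosing it per tenant up front.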

Tags

elastic

multi tenant

shards

Annotators

ramlinuxprasad

URL

elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html