SolrCloud and Shard Splitting
Shalin Shekhar Mangar
Bangalore Lucene/Solr Meetup, 8th June 2013
Who am I?
● Apache Lucene/Solr Committer and PMC member
● Contributor since January 2008
● Currently: Engineer at LucidWorks
● Formerly with AOL
● Email: shalin@apache.org
● Twitter: shalinmangar
● Blog: http://shal.in
SolrCloud: Overview
● Distributed searching/indexing
● No single points of failure
● Near-real-time friendly (push replication)
● Transaction logs for durability and recovery
● Real-time get
● Atomic updates
● Optimistic concurrency
● Request forwarding from any node in the cluster
● A strong contender for your NoSQL needs as well
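Document Routing
[Figure: a collection created with numShards=4 and router=compositeId. The four shards (shard1 to shard4) cover the hash ranges 00000000-3fffffff, 40000000-7fffffff, 80000000-bfffffff and c0000000-ffffffff. The document id = BigCo!doc5 is hashed with MurmurHash3 to 1f273c71. A query q=my_query with shard.keys=BigCo! covers the hash range 1f270000 to 1f27ffff and is sent to shard1.]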
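The routing figure can be sketched in a few lines of Python. This is a minimal illustration, not Solr's implementation. It assumes the composite hash takes its top 16 bits from the hash of the prefix (BigCo) and its bottom 16 bits from the hash of the rest of the id, which is what the 1f27/3c71 split and the 1f270000 to 1f27ffff query range on the slide suggest. CRC32 stands in for MurmurHash3 because the Python standard library has no MurmurHash3, so the concrete hash values differ from the slide. All function names are hypothetical.

```python
import zlib

# Hash ranges of the four shards as listed on the slide
# (inclusive bounds, written as unsigned 32-bit integers).
SHARD_RANGES = [
    (0x00000000, 0x3FFFFFFF),
    (0x40000000, 0x7FFFFFFF),
    (0x80000000, 0xBFFFFFFF),
    (0xC0000000, 0xFFFFFFFF),
]


def stand_in_hash(text):
    """Stand-in 32-bit hash (CRC32). Solr itself uses MurmurHash3."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def composite_hash(prefix_hash, rest_hash):
    """Top 16 bits from the prefix hash, bottom 16 bits from the rest of the id."""
    return (prefix_hash & 0xFFFF0000) | (rest_hash & 0x0000FFFF)


def prefix_range(prefix_hash):
    """Range of all composite hashes that share the prefix's top 16 bits."""
    low = prefix_hash & 0xFFFF0000
    return low, low | 0x0000FFFF


def find_range(hash_value, ranges=SHARD_RANGES):
    """Return the (low, high) shard range that contains hash_value."""
    for low, high in ranges:
        if low <= hash_value <= high:
            return low, high
    raise ValueError("hash value outside all shard ranges")


def route(doc_id):
    """Route 'prefix!rest' ids by composite hash, plain ids by their full hash."""
    if "!" not in doc_id:
        return find_range(stand_in_hash(doc_id))
    prefix, _, rest = doc_id.partition("!")
    return find_range(composite_hash(stand_in_hash(prefix), stand_in_hash(rest)))
```

With the values from the slide, `composite_hash` applied to a prefix hash starting with 1f27 and a document hash ending in 3c71 gives 1f273c71, and `prefix_range` gives 1f270000 to 1f27ffff. Because the shard ranges are aligned on the top bits, every document with the same prefix lands in the same range, which is why a query with shard.keys=BigCo! only needs one shard.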
SolrCloud Collections API
● /admin/collections?action=CREATE&name=mycollection
– &numShards=3
– &replicationFactor=4
– &maxShardsPerNode=2
– &createNodeSet=node1:8080,node2:8080,node3:8080,...
– &collection.configName=myconfigset
● /admin/collections?action=DELETE&name=mycollection
● /admin/collections?action=RELOAD&name=mycollection
● /admin/collections?action=CREATEALIAS&name=south
– &collections=KA,TN,AP,KL,...
● Coming soon: Shard aliases
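The calls above are plain HTTP GET requests, so they are easy to assemble from a script. The sketch below only builds the URLs. The base URL (host, port and /solr context path) is an assumption and must be adapted to your installation, and `collections_api_url` is a hypothetical helper name. The alias example uses just the four collection names visible on the slide.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and context path to your install.
BASE_URL = "http://localhost:8983/solr"


def collections_api_url(action, params, base_url=BASE_URL):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


# CREATE with the parameter values shown on the slide.
create_url = collections_api_url("CREATE", {
    "name": "mycollection",
    "numShards": 3,
    "replicationFactor": 4,
    "maxShardsPerNode": 2,
    "collection.configName": "myconfigset",
})

# CREATEALIAS over the collections named on the slide.
alias_url = collections_api_url("CREATEALIAS", {
    "name": "south",
    "collections": "KA,TN,AP,KL",
})

print(create_url)
print(alias_url)
```

`urlencode` percent-encodes the commas in the collection list, which is equivalent to the literal form on the slide once the server decodes the query string.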
Shard Splitting: Background
● Before Solr 4.3, the number of shards had to be fixed at the time of collection creation
● This forced people to start with a large number of shards
● If a shard ran too hot, the only fix was to re-index and thereby re-balance the collection
● Each shard is assigned a hash range
● Each shard also has a state, which defaults to 'ACTIVE'
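To make "each shard is assigned a hash range" concrete, here is a small sketch that cuts the 32-bit hash space into contiguous ranges and gives every shard a default state. It is an illustration under assumptions: the `Shard` class, the `range1`, `range2`, ... labels and the rounding for shard counts that do not divide the hash space evenly are made up for this example, and hash values are written as unsigned integers. For four shards it yields the four ranges listed on the Document Routing slide.

```python
from dataclasses import dataclass

HASH_SPACE = 2 ** 32  # number of distinct 32-bit hash values


@dataclass
class Shard:
    name: str              # illustrative label, not Solr's shard naming
    low: int               # inclusive lower bound of the hash range
    high: int              # inclusive upper bound of the hash range
    state: str = "active"  # default state, as noted on the slide


def partition_hash_space(num_shards):
    """Cut the hash space into num_shards contiguous, near-equal ranges."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    size = HASH_SPACE // num_shards
    shards = []
    for i in range(num_shards):
        low = i * size
        # The last range absorbs any remainder so the whole space is covered.
        high = HASH_SPACE - 1 if i == num_shards - 1 else low + size - 1
        shards.append(Shard(f"range{i + 1}", low, high))
    return shards


for shard in partition_hash_space(4):
    print(f"{shard.name}: {shard.low:08x}-{shard.high:08x} ({shard.state})")
```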
Shard Splitting: Features
● Seamless on-the-fly splitting – no downtime required
● Retried on failures
● /admin/collections?action=SPLITSHARD&collection=mycollection
– &shard=shardId
● A lower-level CoreAdmin API comes free!
– /admin/cores?action=SPLIT&core=core0&targetCore=core1&targetCore=core2
– /admin/cores?action=SPLIT&core=core0&path=/path/to/index/1&path=/path/to/index/2
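Both split calls can be built the same way as any other Solr admin URL. The point worth showing is that the CoreAdmin call repeats the targetCore (or path) parameter once per sub-index, so the query string has to be built from a list of pairs instead of a dictionary. The base URL and the helper names below are assumptions for illustration, and `shard1` is only an example value for the shardId placeholder on the slide.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and context path to your install.
BASE_URL = "http://localhost:8983/solr"


def split_shard_url(collection, shard, base_url=BASE_URL):
    """Collections API: ask SolrCloud to split one shard of a collection."""
    query = urlencode({
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
    })
    return f"{base_url}/admin/collections?{query}"


def core_split_url(core, target_cores=(), paths=(), base_url=BASE_URL):
    """CoreAdmin API: split a core into existing cores or into index paths."""
    pairs = [("action", "SPLIT"), ("core", core)]
    # Repeated parameters: one targetCore or path entry per sub-index.
    pairs += [("targetCore", name) for name in target_cores]
    pairs += [("path", path) for path in paths]
    return f"{base_url}/admin/cores?{urlencode(pairs)}"


print(split_shard_url("mycollection", "shard1"))
print(core_split_url("core0", target_cores=["core1", "core2"]))
print(core_split_url("core0", paths=["/path/to/index/1", "/path/to/index/2"]))
```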
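Shard Splitting
[Figure: a collection with Shard1, Shard2 and Shard3, each with a leader and a replica. Shard2 is being split into the sub-shards Shard2_0 and Shard2_1; an arrow labelled "update" is shown alongside the sub-shards.]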
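In terms of hash ranges, the split in the diagram amounts to cutting the parent's range into two contiguous pieces. The sketch below assumes the cut is made at the midpoint and that sub-shards are named by appending _0 and _1 to the parent name, as the Shard2_0 and Shard2_1 labels suggest. The function names and the example range are illustrative only.

```python
def split_range(low, high):
    """Split an inclusive hash range into two contiguous halves."""
    if low >= high:
        raise ValueError("range is too small to split")
    mid = low + (high - low) // 2
    return (low, mid), (mid + 1, high)


def sub_shard_names(parent):
    """Names of the two sub-shards, following the parent_0 / parent_1 pattern."""
    return f"{parent}_0", f"{parent}_1"


# Example: a parent shard that owns 80000000-bfffffff (illustrative choice).
for name, (low, high) in zip(sub_shard_names("shard2"),
                             split_range(0x80000000, 0xBFFFFFFF)):
    print(f"{name}: {low:08x}-{high:08x}")
```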
Shard Splitting: Mechanism
● New sub-shards are created in the “construction” state
● The leader starts forwarding applicable updates, which are buffered by the sub-shards
● The leader's index is split and installed on the sub-shards
● The sub-shards apply the buffered updates
● Replicas are created for the sub-shards and brought up to speed
● The sub-shards become “active” and the old shard becomes “inactive”
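The sequence above can be followed in a toy, in-memory simulation. This is a simplified model and not Solr code: an index is a dictionary from hash value to document, the split is assumed to happen at the midpoint of the parent's range, replica creation is left out, and all class and function names are invented for the example.

```python
class SubShard:
    def __init__(self, name, low, high):
        self.name = name
        self.low = low
        self.high = high
        self.state = "construction"  # state while the split is in progress
        self.docs = {}               # installed index: hash value -> document
        self.buffer = []             # updates received before the index is installed

    def covers(self, hash_value):
        return self.low <= hash_value <= self.high


def split_parent(parent_docs, parent_range, updates_during_split):
    """Run the split steps and return (sub_shards, new_parent_state)."""
    low, high = parent_range
    mid = low + (high - low) // 2
    # Step 1: create the sub-shards in the "construction" state.
    subs = [SubShard("sub_0", low, mid), SubShard("sub_1", mid + 1, high)]

    # Step 2: the leader forwards each update to the sub-shard whose
    # range covers it; the sub-shard only buffers it for now.
    for hash_value, doc in updates_during_split:
        for sub in subs:
            if sub.covers(hash_value):
                sub.buffer.append((hash_value, doc))

    # Step 3: split the leader's index by hash range and install the parts.
    for hash_value, doc in parent_docs.items():
        for sub in subs:
            if sub.covers(hash_value):
                sub.docs[hash_value] = doc

    # Step 4: replay the buffered updates on top of the installed index.
    for sub in subs:
        for hash_value, doc in sub.buffer:
            sub.docs[hash_value] = doc
        sub.buffer.clear()

    # Step 5 (creating replicas) is omitted in this sketch.
    # Step 6: switch states.
    for sub in subs:
        sub.state = "active"
    return subs, "inactive"
```

Buffering matters because an update that arrives while the index is being split may touch a document that is also in the parent's index. Replaying the buffer after the install lets the newer version win.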
Shard Splitting: Tips and Gotchas
● Supported for collections with a hash-based router, i.e. the “plain” or “compositeId” router
● The operation is executed by the Overseer node, not by the node that received your request
● The HTTP request is synchronous but the operation is asynchronous. A read timeout does not mean failure!
● The operation is retried on failure. Check the parent leader's logs before you re-issue the command, or you may end up with more shards than you want
Shard Splitting: Tips and Gotchas (continued)
● The Solr Admin GUI is not aware of shard states yet, so the inactive parent shard is also shown in “green”
● The CoreAdmin split command can be used against non-cloud deployments. It will spread docs alternately among the sub-indexes
● Inactive shards have to be cleaned up manually. Solr 4.4 will have a delete shard API
● Shard splitting in the 4.3 release is buggy. Wait for 4.3.1
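"Spread docs alternately" for the non-cloud CoreAdmin split can be pictured as a round-robin distribution: with no hash ranges to go by, document 1 goes to the first sub-index, document 2 to the second, and so on. The sketch below only illustrates that idea on a Python list. The function name is made up, and it says nothing about the order in which Solr actually visits documents.

```python
def split_alternately(docs, num_parts):
    """Distribute docs round-robin over num_parts sub-lists."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    # Sub-list i takes every num_parts-th document, starting at position i.
    return [docs[i::num_parts] for i in range(num_parts)]


parts = split_alternately(["d1", "d2", "d3", "d4", "d5"], 2)
print(parts)
```

The sub-indexes end up nearly equal in size, but, unlike a SolrCloud split, which sub-index holds a given document has nothing to do with the document's hash.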
Shard Splitting: Looking towards the future
● GUI integration and better progress reporting/monitoring
● Better support for custom sharding use-cases
● More flexibility in the number of sub-shards, hash ranges, number of replicas, etc.
● Store the replication factor per shard
● Suggest splits to admins based on cluster state and load
About LucidWorks
• Intro to LucidWorks (formerly Lucid Imagination)
– Follow: @lucidworks, @lucidimagineer
– Learn: http://www.lucidworks.com
• Check out SearchHub: http://www.searchhub.org
• Solr 4.1 Reference Guide: http://bit.ly/11KSiMN
– Older versions: http://bit.ly/12t1Egq
• Our Products
– LucidWorks Search
– LucidWorks Big Data
• Lucene Revolution
– http://www.lucenerevolution.com
Thank you
Shalin Shekhar Mangar
LucidWorks

SolrCloud and Shard Splitting

  • 1. SolrCloud and Shard Splitting Shalin Shekhar Mangar
  • 2. Bangalore Lucene/Solr Meetup 8th June 2013 Who am I? ● Apache Lucene/Solr Committer and PMC member ● Contributor since January 2008 ● Currently: Engineer at LucidWorks ● Formerly with AOL ● Email: shalin@apache.org ● Twitter: shalinmangar ● Blog: http://shal.in
  • 3. SolrCloud: Overview
      ● Distributed searching/indexing
      ● No single points of failure
      ● Near Real Time Friendly (push replication)
      ● Transaction logs for durability and recovery
      ● Real-time get
      ● Atomic Updates
      ● Optimistic Concurrency
      ● Request forwarding from any node in cluster
      ● A strong contender for your NoSQL needs as well
  • 5. Document Routing (diagram)
      ● Collection settings: numShards=4, router=compositeId
      ● Hash ranges, one per shard (shard1 to shard4): 00000000-3fffffff, 40000000-7fffffff, 80000000-bfffffff, c0000000-ffffffff
      ● id = BigCo!doc5 is hashed (MurmurHash3) to 1f273c71; the key BigCo! covers the hash range 1f270000 to 1f27ffff, which routes to shard1
      ● Query restricted to that key: q=my_query with shard.keys=BigCo!
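The routing on this slide can be sketched in a few lines of Python. This is an illustration under assumptions, not Solr's implementation: the hash values are plugged in by hand instead of being computed with MurmurHash3, hashes are treated as unsigned 32-bit integers, and the pairing of shard names with ranges is assumed (chosen so that 1f27xxxx lands in shard1, as the slide indicates). The 16-bit/16-bit split between shard key and document id matches the 1f27 / 3c71 example on the slide.

```python
# Hash ranges from the slide. Which shard name owns which range is an
# assumption made for this sketch.
SHARD_RANGES = {
    "shard1": (0x00000000, 0x3FFFFFFF),
    "shard2": (0x40000000, 0x7FFFFFFF),
    "shard3": (0x80000000, 0xBFFFFFFF),
    "shard4": (0xC0000000, 0xFFFFFFFF),
}


def composite_hash(key_hash: int, doc_hash: int) -> int:
    """Combine the top 16 bits of the shard-key hash with the low 16 bits of the doc-id hash."""
    return (key_hash & 0xFFFF0000) | (doc_hash & 0x0000FFFF)


def key_range(key_hash: int) -> tuple:
    """Hash range covered by every document that shares this shard key."""
    low = key_hash & 0xFFFF0000
    return low, low | 0x0000FFFF


def shard_for(hash_value: int) -> str:
    """Find the shard whose range contains the given 32-bit hash."""
    for name, (low, high) in SHARD_RANGES.items():
        if low <= hash_value <= high:
            return name
    raise ValueError(f"hash {hash_value:08x} is outside every range")


# Hypothetical stand-ins for the hashes of "BigCo" and "doc5"; only the
# 1f27 and 3c71 halves come from the slide.
BIGCO_HASH = 0x1F27ABCD
DOC5_HASH = 0x12343C71

doc_hash = composite_hash(BIGCO_HASH, DOC5_HASH)
print(f"{doc_hash:08x}", shard_for(doc_hash))
```

Because every id that starts with BigCo! shares the same top 16 bits, a query carrying shard.keys=BigCo! only needs to visit the shard that owns that 1f270000 to 1f27ffff slice.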
  • 6. SolrCloud Collections API
      ● /admin/collections?action=CREATE&name=mycollection
          – &numShards=3
          – &replicationFactor=4
          – &maxShardsPerNode=2
          – &createNodeSet=node1:8080,node2:8080,node3:8080,...
          – &collection.configName=myconfigset
      ● /admin/collections?action=DELETE&name=mycollection
      ● /admin/collections?action=RELOAD&name=mycollection
      ● /admin/collections?action=CREATEALIAS&name=south
          – &collections=KA,TN,AP,KL,...
      ● Coming soon: Shard aliases
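As a rough illustration, the Collections API calls listed on this slide can be assembled with the Python standard library. The base URL (host, port and /solr context path) is an assumption, the helper name collections_url is made up for this sketch, and createNodeSet is left out because the slide elides the full node list. The sketch only builds the URLs; sending them is left to whatever HTTP client is at hand.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust to the actual Solr node.
SOLR = "http://localhost:8983/solr"


def collections_url(action: str, params: dict) -> str:
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params}, safe=",")
    return f"{SOLR}/admin/collections?{query}"


create_url = collections_url("CREATE", {
    "name": "mycollection",
    "numShards": 3,
    "replicationFactor": 4,
    "maxShardsPerNode": 2,
    "collection.configName": "myconfigset",
})

# The slide's alias list continues past these four collections.
alias_url = collections_url("CREATEALIAS", {
    "name": "south",
    "collections": "KA,TN,AP,KL",
})

print(create_url)
print(alias_url)
```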
  • 7. Shard Splitting: Background
      ● Before Solr 4.3, the number of shards had to be fixed at the time of collection creation
      ● This forced people to start with a large number of shards
      ● If a shard ran too hot, the only fix was to re-index and thereby re-balance the collection
      ● Each shard is assigned a hash range
      ● Each shard also has a state, which defaults to 'ACTIVE'
  • 8. Shard Splitting: Features
      ● Seamless on-the-fly splitting; no downtime required
      ● Retried on failures
      ● /admin/collections?action=SPLITSHARD&collection=mycollection
          – &shard=shardId
      ● A lower-level CoreAdmin API comes free!
          – /admin/cores?action=SPLIT&core=core0&targetCore=core1&targetCore=core2
          – /admin/cores?action=SPLIT&core=core0&path=/path/to/index/1&path=/path/to/index/2
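The two split endpoints differ mainly in that the CoreAdmin variant repeats a parameter (targetCore or path) once per output. A minimal sketch of building both URLs follows; the base URL and the helper names are assumptions made for this example, while the parameter names are the ones shown on the slide.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust to the actual Solr node.
SOLR = "http://localhost:8983/solr"


def split_shard_url(collection: str, shard: str) -> str:
    """Collections API: split one shard of a SolrCloud collection."""
    query = urlencode({"action": "SPLITSHARD", "collection": collection, "shard": shard})
    return f"{SOLR}/admin/collections?{query}"


def core_split_url(core: str, target_cores=(), paths=()) -> str:
    """CoreAdmin API: split a core into existing target cores or into index paths."""
    pairs = [("action", "SPLIT"), ("core", core)]
    # Repeated parameters are passed as a list of pairs so each one is kept.
    pairs += [("targetCore", name) for name in target_cores]
    pairs += [("path", path) for path in paths]
    return f"{SOLR}/admin/cores?" + urlencode(pairs, safe="/")


print(split_shard_url("mycollection", "shard1"))
print(core_split_url("core0", target_cores=["core1", "core2"]))
print(core_split_url("core0", paths=["/path/to/index/1", "/path/to/index/2"]))
```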
  • 9. Shard Splitting (diagram): Shard1, Shard2 and Shard3 each have a leader and a replica; Shard2 is shown with sub-shards Shard2_0 and Shard2_1, and an arrow is labelled “update”
  • 10. Shard Splitting: Mechanism
      ● New sub-shards are created in “construction” state
      ● Leader starts forwarding applicable updates, which are buffered by the sub-shards
      ● Leader index is split and installed on the sub-shards
      ● Sub-shards apply buffered updates
      ● Replicas are created for sub-shards and brought up to speed
      ● Sub-shards become “active” and the old shard becomes “inactive”
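The sequence above can be walked through with a toy in-memory model. This is a simulation of the listed steps only, not Solr code: the index is a plain dict keyed by document hash, the parent range is assumed to be cut into two equal halves, and replica creation is reduced to a comment.

```python
from dataclasses import dataclass, field


@dataclass
class SubShard:
    name: str
    low: int
    high: int
    state: str = "construction"  # step 1: sub-shards start in "construction"
    index: dict = field(default_factory=dict)
    buffer: list = field(default_factory=list)

    def owns(self, doc_hash: int) -> bool:
        return self.low <= doc_hash <= self.high


def split_shard(name, low, high, parent_index, updates_during_split):
    """Simulate a split; returns the two sub-shards and the parent's final state."""
    mid = low + (high - low) // 2  # assumption: two equal halves
    subs = [SubShard(f"{name}_0", low, mid), SubShard(f"{name}_1", mid + 1, high)]

    # Step 2: the leader forwards applicable updates, which the sub-shards buffer.
    for doc_hash, doc in updates_during_split:
        for sub in subs:
            if sub.owns(doc_hash):
                sub.buffer.append((doc_hash, doc))

    # Step 3: the leader index is split by hash range and installed on the sub-shards.
    for doc_hash, doc in parent_index.items():
        for sub in subs:
            if sub.owns(doc_hash):
                sub.index[doc_hash] = doc

    # Step 4: the sub-shards apply their buffered updates.
    for sub in subs:
        for doc_hash, doc in sub.buffer:
            sub.index[doc_hash] = doc
        sub.buffer.clear()

    # Step 5 (replica creation and catch-up) is not modelled here.

    # Step 6: sub-shards become "active", the parent becomes "inactive".
    for sub in subs:
        sub.state = "active"
    return subs, "inactive"


parent_index = {0x40000001: "doc-a", 0x70000000: "doc-b"}
late_updates = [(0x60000000, "doc-c")]
subs, parent_state = split_shard("shard2", 0x40000000, 0x7FFFFFFF, parent_index, late_updates)
for sub in subs:
    print(sub.name, f"{sub.low:08x}-{sub.high:08x}", sub.state, sorted(sub.index.values()))
```

Buffering before the index is split is what keeps updates that arrive mid-split from being lost: they are replayed on top of the installed sub-index.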
  • 11. Shard Splitting: Tips and Gotchas
      ● Supports collections with a hash-based router, i.e. the “plain” or “compositeId” routers
      ● The operation is executed by the Overseer node, not by the node that received the request
      ● The HTTP request is synchronous but the operation is asynchronous. A read timeout does not mean failure!
      ● The operation is retried on failure. Check the parent leader's logs before you re-issue the command, or you may end up with more shards than you want
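Since a read timeout says nothing about the outcome, one cautious habit is to look at the shard states before re-issuing the command. The sketch below works on an already-parsed cluster state; the dict layout (collection, then "shards", then a "state" per shard) and the function name are assumptions for illustration, and the sub-shard names follow the Shard2_0 / Shard2_1 pattern from the diagram slide. It complements checking the parent leader's logs rather than replacing it.

```python
def split_status(cluster_state: dict, collection: str, shard: str) -> str:
    """Classify a split as 'not started', 'in progress' or 'done' from shard states."""
    shards = cluster_state[collection]["shards"]
    subs = [shards.get(f"{shard}_{i}") for i in (0, 1)]
    if all(sub is None for sub in subs):
        return "not started"
    subs_active = all(sub is not None and sub.get("state") == "active" for sub in subs)
    parent_inactive = shards.get(shard, {}).get("state", "active") == "inactive"
    if subs_active and parent_inactive:
        return "done"
    return "in progress"


# Hypothetical cluster state after a split of shard2 has completed.
example_state = {
    "mycollection": {
        "shards": {
            "shard1": {"state": "active"},
            "shard2": {"state": "inactive"},
            "shard2_0": {"state": "active"},
            "shard2_1": {"state": "active"},
        }
    }
}
print(split_status(example_state, "mycollection", "shard2"))
```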
  • 12. Shard Splitting: Tips and Gotchas (continued)
      ● The Solr Admin GUI is not aware of shard states yet, so the inactive parent shard is also shown in “green”
      ● The CoreAdmin split command can be used against non-cloud deployments. It will spread docs alternately among the sub-indexes
      ● Inactive shards have to be cleaned up manually. Solr 4.4 will have a delete shard API
      ● Shard splitting in the 4.3 release is buggy. Wait for 4.3.1
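One reading of "spread docs alternately" is a simple round-robin over the sub-indexes, in contrast to the hash-range split used in SolrCloud mode. The snippet below illustrates that reading only; it is not taken from Solr's code, and the function name is made up.

```python
def split_alternately(docs, num_sub_indexes):
    """Deal documents out to the sub-indexes in turn (round-robin)."""
    sub_indexes = [[] for _ in range(num_sub_indexes)]
    for position, doc in enumerate(docs):
        sub_indexes[position % num_sub_indexes].append(doc)
    return sub_indexes


print(split_alternately(["d0", "d1", "d2", "d3", "d4"], 2))
```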
  • 13. Shard Splitting: Looking towards the future
      ● GUI integration and better progress reporting/monitoring
      ● Better support for custom sharding use-cases
      ● More flexibility in the number of sub-shards, hash ranges, number of replicas, etc.
      ● Store replication factor per shard
      ● Suggest splits to admins based on cluster state and load
  • 14. About LucidWorks
      ● Intro to LucidWorks (formerly Lucid Imagination)
          – Follow: @lucidworks, @lucidimagineer
          – Learn: http://www.lucidworks.com
      ● Check out SearchHub: http://www.searchhub.org
      ● Solr 4.1 Reference Guide: http://bit.ly/11KSiMN
          – Older versions: http://bit.ly/12t1Egq
      ● Our Products
          – LucidWorks Search
          – LucidWorks Big Data
      ● Lucene Revolution
          – http://www.lucenerevolution.com
      ● Confidential and Proprietary © 2012 LucidWorks
  • 15. Thank you. Shalin Shekhar Mangar, LucidWorks