
Shards and Replicas


For people new to Elasticsearch, shard creation can be a bit of a mystery. Shards are created automatically whenever an index is created, and replica shards can be added or removed later through the Elasticsearch API. If you find yourself dealing with too many shards, the first step to reducing them is to figure out where they're coming from.
[block:api-header]
{
  "type": "basic",
  "title": "Where do shards come from?"
}
[/block]
A little bit of background: whenever you create an index on a cluster, that index is composed of shards. A shard is a [Lucene index](http://lucene.apache.org/) and is the main component responsible for storing and retrieving documents. Shards play one of two roles: primary or replica. Primary shards are a logical partitioning of the data in the index, and their number is fixed at the time the index is created. Replica shards are extra copies used for redundancy or to handle extra search traffic, and can be added and removed on demand.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/qNFuyoeARFmlzsQfu4Ox_reduce-shards09.jpg",
        "reduce-shards09.jpg",
        "2732",
        "2048",
        "#428fd1",
        ""
      ]
    }
  ]
}
[/block]
You can specify how many primary shards and replicas are used when creating a new index.
[block:code]
{
  "codes": [
    {
      "code": "PUT /my_index\n{\n  \"settings\": {\n    \"number_of_shards\": 3,\n    \"number_of_replicas\": 2\n  }\n}",
      "language": "json"
    }
  ]
}
[/block]
The replica setting is the number of extra copies of each primary shard, so replicas act as a multiplier on the primary shards. The total is calculated as primary * (1 + replicas). In other words, if you create an index with 3 primary shards and 2 replicas, you will have 9 total shards, not 5 or 6.
[block:callout]
{
  "type": "warning",
  "title": "Replicas and High-Availability",
  "body": "By default, all new indices are created with a single replica, which is the only setting we officially support for High-Availability for production clusters. Sometimes, users with production clusters will set some or all of their indices to have 0 replicas. Although we currently allow this, we highly discourage it for production indices. Any index with replication turned off is not in a High-Availability configuration, and in the event of a data loss incident, we do not support backups or restoration from backups for any unreplicated indices."
}
[/block]

[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/f5yajUc0QmaFq8b1gnYr_reduce-shards07.png",
        "reduce-shards07.png",
        "2400",
        "1324",
        "#3c8ccc",
        ""
      ]
    }
  ]
}
[/block]

[block:api-header]
{
  "type": "basic",
  "title": "Measuring your cluster’s index and shard usage"
}
[/block]
Elasticsearch offers several API endpoints for exploring the state of your indices and shards. The `_cat` APIs return compact, human-readable text. You can view your index states by requesting `/_cat/indices`, which shows each index's name along with its primary shard and replica counts. You can also inspect individual shard states and statistics by requesting `/_cat/shards`. See the example output below:

[block:code]
{
  "codes": [
    {
      "code": "$ curl -s https://user:password@bonsai-12345.bonsai.io/_cat/indices?v\nhealth status index  pri rep docs.count docs.deleted store.size pri.store.size\ngreen  open   images   1   0          0            0       130b           130b\ngreen  open   videos   1   0          0            0       130b           130b\ngreen  open   notes    1   0          0            0       130b           130b\n\n$ curl -s https://user:password@bonsai-12345.bonsai.io/_cat/shards?v\nindex  shard prirep state   docs store ip              node\nimages 0     p      STARTED    0  130b XXX.XXX.XXX.XXX Sugar Man\nnotes  0     p      STARTED    0  130b XXX.XXX.XXX.XXX Sugar Man\nvideos 0     p      STARTED    0  130b XXX.XXX.XXX.XXX Sugar Man",
      "language": "curl"
    }
  ]
}
[/block]
We've also made this easy by creating a live interactive console for you. Just visit your cluster's dashboard console, choose `GET` from the dropdown, and run `/_cat/indices?v` or `/_cat/shards?v`:
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/fa2ToVHYRK6r56Kc8bOn_console_view.png",
        "console_view.png",
        "2000",
        "1008",
        "#072e35",
        ""
      ]
    }
  ]
}
[/block]
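
To turn the `/_cat/indices?v` output into a shard count, the primary * (1 + replicas) formula can be applied to each row. The following is a minimal Python sketch, not an official tool. The function names `total_shards` and `shard_usage` are hypothetical, the sample text is copied from the example output above, and the parser assumes every column is populated and contains no embedded whitespace.

```python
# Sample text copied from the `/_cat/indices?v` example output.
SAMPLE = """\
health status index  pri rep docs.count docs.deleted store.size pri.store.size
green  open   images   1   0          0            0       130b           130b
green  open   videos   1   0          0            0       130b           130b
green  open   notes    1   0          0            0       130b           130b
"""


def total_shards(primaries, replicas):
    """Each replica is one extra copy of every primary shard."""
    return primaries * (1 + replicas)


def shard_usage(cat_indices_output):
    """Map each index name to its total shard count.

    Hypothetical helper: it relies on the header row printed by `?v`
    to locate the `index`, `pri`, and `rep` columns.
    """
    lines = [line for line in cat_indices_output.splitlines() if line.strip()]
    header = lines[0].split()
    usage = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        usage[row["index"]] = total_shards(int(row["pri"]), int(row["rep"]))
    return usage


usage = shard_usage(SAMPLE)
print(usage)                # {'images': 1, 'videos': 1, 'notes': 1}
print(sum(usage.values()))  # 3 shards across the whole cluster
print(total_shards(3, 2))   # 9, the 3-primary, 2-replica example
```

In practice you would pass in the text returned by your own cluster's `/_cat/indices?v` request instead of `SAMPLE`. Summing the values gives the cluster-wide shard total.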