Categories

Connection Issues

This article aims to reduce the friction of resolving connection problems by offering suggestions for quickly identifying a root cause.
Last updated
July 7, 2023

Troubleshooting and resolving connection issues can be time consuming and frustrating. This article aims to reduce the friction of resolving connection problems by offering suggestions for quickly identifying a root cause.

First Steps

If you're seeing connection errors while attempting to reach your Bonsai cluster, the first step is Don't Panic. Virtually all connection issues can be fixed in under 5 minutes once the root cause is identified.

The next step is to review the documentation on Connecting to Bonsai, which contains plenty of information about creating connections and testing the availability of your Bonsai cluster.

If the basic documentation doesn't help resolve the problem, the next step is to read the error message very carefully. Often the error message contains enough information to explain the problem. For example, something like this may show up in your logs:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Faraday::ConnectionFailed (Connection refused - connect(2) for "localhost" port 9200):</code></pre>
</div>

This message tells us that the client tried to reach Elasticsearch over localhost, which is a red flag (discussed below). It means the client is not pointed at your Bonsai cluster.

Next, look for an HTTP error code. Bonsai provides an entire article dedicated to explaining what different HTTP codes mean here: HTTP Error Codes. Like the error message, the HTTP error code often provides enough diagnostic information to identify the underlying problem.

Last, don't make any changes until you understand the error message and HTTP code (if one is present), and have read the relevant documentation. Most of these errors occur during the initial set up and configuration step, so also read the relevant Quickstart guide if one exists for your language / framework.

Common Connection Issues

There are issues which do not return HTTP error codes because the request can not be made at all, or because the request is not recognized by Elasticsearch. These issues are commonly one of the following:

  • Connection Refused
  • No handler found for URI
  • Name or service not known
  • HTTP 401 / Missing Authentication Headers
  • Missing SSL Root Certificate

Connection Refused

This error indicates that a request was made to a server that is not accepting connections.

This error most commonly occurs when your Elasticsearch client has not been properly configured. By default, many clients will try to connect to something like `localhost:9200`. This is a problem because Bonsai will never be running on your localhost or network, regardless of your platform.

What is Localhost?

Simply, localhost can be interpreted as "this computer." Which computer is "this" computer is determined by whatever machine is running the code. If you're running the code on your laptop, then localhost is the machine in front of you; if you're running the code on AWS or in Heroku, then the localhost is the node running your application.

Bonsai clusters run in a private network that does not include any of your infrastructure, which is why trying to reach it via localhost will always fail.

Why Port 9200?

By default, Elasticsearch runs on port 9200. While you can access your Bonsai cluster over port 9200, this is not recommended due to lack of encryption in transit.

All users will all need to ensure their Elasticsearch client is pointed at the correct URL and ensure that the Elasticsearch client is properly configured.

Ruby / Rails

If you’re using the elastisearch-rails client, simply add the following gem to your Gemfile:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://assets-global.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">gem 'bonsai-elasticsearch-rails'</code></pre>
</div>
</div>

The bonsai-elasticsearch-rails gem is a shim that configures your Elasticsearch client to load the cluster URL from an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>. You can read more about it on the project repository.

If you'd prefer to keep your Gemfile sparse, you can initialize the client yourself like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://assets-global.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript"># config/initializers/elasticsearch.rb
Elasticsearch::Model.client = Elasticsearch::Client.new url: ENV['BONSAI_URL']</code></pre>
</div>
</div>

If you opt for this method, make sure to add the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> to your environment. It will be automatically created for Heroku users. Users managing their own application environment will need to run something like:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://assets-global.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">$ export BONSAI_URL="https://randomuser:randompass@something-12345.us-east-1.bonaisearch.net"</code></pre>
</div>
</div>

Other Languages / Frameworks

The Elasticsearch client is probably using the default <span class="inline-code"><pre><code>localhost:9200</code></pre></span> or <span class="inline-code"><pre><code>127.0.0.1:9200</code></pre></span> (127.0.0.1 is the IPv4 equivalent of "localhost"). You'll need to make sure that the client is configured to use the correct URL for your cluster, and that this configuration is not being overwritten somewhere.

No handler found for URI

The Elasticsearch API has a variety of endpoints defined, like <span class="inline-code"><pre><code>/_cat/indices</code></pre></span>. Each of these endpoints can be called with a specific HTTP method. This error simply indicates a request was made to an endpoint using the wrong method.

Here are some examples:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://assets-global.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">
 # Request to /_cat/indices with GET method:
 $ curl -XGET https://user:pass@something-12345.us-east-1.bonsai.io/_cat/indices
 green open test1           1 1 0 0  318b  159b
 green open test2           1 1 0 0  318b  159b

 # Request to /_cat/indices with PUT method:
 $ curl -XPUT https://user:pass@something-12345.us-east-1.bonsai.io/_cat/indices
 No handler found for uri [/_cat/indices] and method [PUT]</code></pre>
</div>
</div>

The solution for this issue is simple: use the correct HTTP method. The Elasticsearch documentation will offer guidance on which methods pertain to the endpoint you're trying to use.

Name or service not known

The Domain Name System (DNS) is a worldwide, decentralized and distributed directory service which translates human-readable domains like www.example.com in to network addresses like 93.184.216.119. When a client makes a request to a URL like "google.com," the application's networking layer will use DNS to translate the domain to an IP address so that it can pass the request along to the right server.

The "Name or service not known" error indicates that there has been a failure in determining an IP address for the URL's domain. This typically implicates one of several root causes:

  1. There is a typo in the URL.
  2. There is a TLD outage.
  3. The Internet Service Provider (ISP) handling your application's traffic may have lost a DNS server, or is having another outage.

The first troubleshooting step is to carefully read the error and double-check that the URL (particularly the domain name) is spelled correctly. If this is your first time accessing the cluster, then a typo is almost certainly the problem.

If this error arises during regular production use, then there is probably a DNS outage. DNS outages are outside of Bonsai's control. There are a couple types of outages, but the ones that have affected our users before are:

TLD outage

A TLD is something like ".com," ".net," ".org," etc. A TLD outage affects all domains under the TLD. So an outage of the ".net" TLD would affect all domains with the ".net" suffix.

Fortunately, all Bonsai clusters can be accessed via either of two domains:

  • bonsai.io
  • bonsaisearch.net

If there is a TLD outage, you should be able to restore service by switching to the other domain. In other words, if your application is sending traffic to <span class="inline-code"><pre><code>https://user:pass@something-12345.us-east-1.bonsai.io</code></pre></span>, and the ".io" TLD goes down, you can switch over to <span class="inline-code"><pre><code>https://user:pass@something-12345.us-east-1.bonsaisearch.net</code></pre></span>, and it will fix the error.

ISP Outage

Many ISPs operate their own DNS servers. This way, requests made from a node in their network can get low-latency responses for IP addresses. Most ISPs also have a fleet of DNS servers, at minimum a primary and a secondary. However, this is not a requirement, and there have been instances where an ISP's entire DNS service is taken offline.

There are also multitenant services which have Software Defined Networks (SDN) running Internal DNS (iDNS). Regressions in this software can also lead to application-level DNS name resolution problems.

If you've already confirmed the domain is correct and swapped to another TLD for your cluster, and you're still having issues, then you are probably dealing with an ISP/DNS or SDN/iDNS outage. One way to confirm this is to try making requests to other common domains like google.com. A name resolution error on google.com or example.com points to a local DNS problem.

If this happens, there is basically nothing you can do about it as a user, aside from complaining to your application sysadmin.

Missing Authentication Headers

Users who are seeing persistent HTTP 401 Error Codes may be using a client that is not handling authentication properly. As explained in the error code documentation, as well as in Connecting to Bonsai, all Bonsai clusters have a randomly-generated username and password which must be present in the request in order for it to be accepted.

What's less clear from this documentation is that including the complete URL in your Elasticsearch client may not be enough to create a secure connection.

This is due to how HTTP Basic Access Authentication works. In short, there needs to be a request header present. This header has an "Authorization" field and a value of <span class="inline-code"><pre><code>Basic</code></pre></span> . The base64 string is a base64 representation of the username and password, concatenated with a colon (":").

Most clients handle the headers for you automatically and in the background. But not all do, especially if the client is part of a bleeding edge language or framework, or if it's something homebrewed/forked/patched.

Here is a basic example in Ruby using <span class="inline-code"><pre><code>Net::HTTP</code></pre></span>, demonstrating how a URL with auth can still receive an HTTP 401 response:

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://assets-global.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">require 'base64'
require 'net/http'

# URL with credentials:
uri = URI("https://randomuser:randompass@something-12345.us-east-1.bonsaisearch.net")

# Net::HTTP does not automatically detect the presence of
# authentication credentials and insert the proper Authorization header.req = Net::HTTP::Get.new(uri)

# This request will fail with an HTTP 401, even though the credentials
# are in the URI:
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
 http.request(req)}

# The proper header must be added manually:
credentials = "randomuser:randompass"
req['Authorization'] = "Basic " + Base64::encode64(credentials).chomp

# The request now succeeds
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
 http.request(req)
}</code></pre>
</div>
</div>

From the Ruby example, it's clear that there are some cases where the credentials are simply ignored instead of being automatically put in to a header. This causes the Basic authentication to fail, and receive the HTTP 401.

Simply, if you're seeing HTTP 401 responses even while including the credentials in the URL, and you've confirmed that the credentials are entered correctly and have not been expired, then the problem is probably a missing header. You can detect this with tools like socat or wireshark if you're familiar with network traffic inspection. Or, you can try adding the headers manually.

Here are some examples of calculating the base64 string and adding the request header in several different languages:

Android

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://assets-global.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">  public static Map getHeaders() {
 Map headers = new HashMap<>();
 String credentials = "randomuser:randompass";
 String auth = "Basic " + Base64.encodeToString(credentials.getBytes(), Base64.NO_WRAP);
 headers.put("Authorization", auth);
 headers.put("Content-type", "application/json");
 return headers;
}</code></pre>
</div>
</div>

Java

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://assets-global.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-java">public static Map getHeaders() {
 Map headers = new HashMap<>();
 String credentials = "randomuser:randompass";
 String auth = "Basic " + Base64.getEncoder().encodeToString(credentials.getBytes());
 headers.put("Authorization", auth);
 headers.put("Content-type", "application/json");
 return headers;
}</code></pre>
</div>
</div>

Ruby

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://assets-global.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-ruby">require 'base64'
require 'net/http'

uri = URI("https://something-12345.us-east-1.bonsaisearch.net")
req = Net::HTTP::Get.new(uri)
credentials = "randomuser:randompass"
req['Authorization'] = "Basic " + Base64::encode64(credentials).chomp
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
 http.request(req)
}</code></pre>
</div>
</div>

Missing SSL Root Certificate

As mentioned in the Security documentation, all Bonsai clusters support SSL/TLS. This enables your traffic to be encrypted over the wire. In some rare cases, users may see something like this when trying to access their cluster over HTTPS:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">SSL: CERTIFICATE_VERIFY_FAILED</code></pre>
</div>

This is almost certainly due to a server level misconfiguration.

A complete examination of how SSL works is well outside the scope of this article, but in short: it utilizes cryptographic signatures to facilitate a chain of trust from a root certificate issued by a certificate authority (CA) to a certificate deployed to a server. The latter is used to prove the server's ownership before initiating public key exchange.

Think of a certificate authority as a mediator of sorts. A company like Bonsai goes to the CA and provides proof of identity and ownership of a domain. The CA issues Bonsai a unique certificate cryptographically signed by the CA's root certificate. Then Bonsai configures its servers to respond to SSL/TLS requests using the certificates signed by the CA's root.

When your application reaches out to Bonsai over HTTPS, the Bonsai server will respond with this certificate. The application can then inspect the certificate it receives from Bonsai and determine whether the CA's cryptographic signature is valid. If it's valid, then the application knows it really is talking to Bonsai and not an impersonator. It then performs a key exchange to open an encrypted connection and then sends/receives data.

One weakness of this system is that the CA has to be recognized by both parties; if your application doesn't have a copy of the CA's root certificate, it can't validate that the server it's talking to is genuine. You may start seeing errors like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Could not verify the SSL certificate for...
SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failedpeer certificate won't be verified in this SSL session
SSL certificate problem, verify that the CA cert is OK
SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed</code></pre>
</div>

These are just a sample of the errors you might see, but it all points to the application's inability to verify Bonsai's certificate.

It's possible that this is caused by an active man in the middle (MITM) attack, but the greater probability is that the application does not have access to the right certificates. There are a few things to check:

1. Make sure the CA root certificate for Bonsai is available

Bonsai's nodes are using Amazon Trust Services as a certificate authority. These CA's need to be added to your trust store. They should be automatically installed in most browsers / operating systems. If you're seeing SSL issues, make sure that you have the correct root certificates in your trust store.

This is sometimes easier said than done. Typically a sysadmin will need to handle this. If you're using Heroku, you'll need to open a request to their support team to verify that the proper root certificate is deployed to their systems. There are some hackish workarounds that involve bundling the certificate in your deploy and configuring the application to read it from somewhere other than the OS' trust store.

2. Make sure the certificate(s) for Bonsai are up to date

Certificate Authorities typically do not issue permanent certificates. Most certificates expire every 1-2 years and need to be renewed. Some browsers and applications will cache SSL certificates to improve latency. It's possible to be using a cached certificate that has passed its expiration date, thus rendering it invalid. Flush your caches and make sure that the Bonsai certificate is up to date.

View code snippet
Close code snippet