Machine Learning & Big Data Blog

How To Use Elastic Enterprise Search with GitHub

Walker Rowe

Elastic acquired Swiftype and added it to its product portfolio, branding it Elastic Enterprise Search. The product lets users query a variety of data sources, including public sources and internal company documents and data.

We previously explained how to install Enterprise Search. In this article, I’ll illustrate how it works by connecting it to GitHub.

Overview: How Elastic Enterprise Search works

Enterprise Search lets users query data sources using natural language. It is particularly useful within organizations that share internal documents. Popular sources you can query with Enterprise Search include:

  • Dropbox
  • Google Docs
  • GitHub
  • Microsoft OneDrive
  • Jira
  • Salesforce
  • Custom sources (via APIs)

Enterprise Search works by indexing search data in Elasticsearch and connecting to the data source using OAuth, an industry standard for letting one app access another on your behalf. To understand OAuth, I liken it to using your Facebook or Google credentials to log in to an app.
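To make the OAuth step more concrete, here is a minimal Python sketch of the first leg of GitHub's OAuth web flow: building the URL that sends the user to GitHub to grant access. Enterprise Search does this for you, so this is only an illustration. The client ID, the redirect URI, and the `repo` scope are placeholder assumptions, not the values Enterprise Search actually uses.

```python
import secrets
from urllib.parse import urlencode

# GitHub's documented authorization endpoint for the OAuth web flow.
GITHUB_AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def build_authorize_url(client_id, redirect_uri, scope="repo", state=None):
    """Return (url, state) for the first step of the OAuth web flow.

    The scope default is an assumption for illustration only.
    """
    # A random state value lets the app confirm later that the callback
    # belongs to the request it started.
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{GITHUB_AUTHORIZE_URL}?{query}", state


if __name__ == "__main__":
    # Hypothetical client ID and server address.
    url, state = build_authorize_url(
        "my-client-id", "http://203.0.113.10:3002/ent/"
    )
    print(url)
```

The user approves the request on GitHub, and GitHub then redirects the browser back to the redirect URI with a temporary code that the app exchanges for an access token.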

Note on GitHub limitations

You cannot use Enterprise Search with your own personal GitHub repositories. Instead, you must use repositories that belong to an organization. In other words, if you are an employee named Fred working at Smith Airlines, then you can search Smith Airlines. You cannot search Fred. That makes sense, since Enterprise Search is designed for an enterprise and not a single individual.
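If you are not sure whether a GitHub account is a personal account or an organization, the public GitHub REST API reports it in the `type` field of `GET /users/{name}`. The sketch below is one way to check. The account names are hypothetical, and `fetch_account` makes an unauthenticated request, so it is subject to GitHub's rate limits.

```python
import json
from urllib.request import urlopen


def is_organization(account: dict) -> bool:
    """True if a GitHub account record describes an organization."""
    # The REST API sets "type" to "User" or "Organization".
    return account.get("type") == "Organization"


def fetch_account(name: str) -> dict:
    """Fetch the public record for a GitHub user or organization."""
    with urlopen(f"https://api.github.com/users/{name}") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # "smith-airlines" is a made-up name; substitute your own.
    account = fetch_account("smith-airlines")
    print("organization" if is_organization(account) else "personal account")
```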

Setting up Elastic Enterprise Search

Follow these steps to set up Elastic Enterprise Search.

  1. Create an OAuth App in GitHub. This is where you define the callback URL that points to your Enterprise Search installation. It also creates the Client ID and Client Secret needed to connect to Enterprise Search.
  2. Create the GitHub source in Enterprise Search.
  3. Enterprise Search polls GitHub for activity.
  4. Start searching.

Configuring GitHub OAuth Settings

Log in to GitHub and click on Settings –> Developer settings for the organization. Make sure you open the organization's settings and not your personal settings.

In this example the repository is walkerrowe:

Go to Developer settings then create a New OAuth App.

Give it a name. For the homepage and callback URLs, use these values:

Homepage URL https://(your server):3002
Authorization callback URL http://(your server):3002/ent/

Note: the Swiftype documentation mentions localhost. Do not use that. (GitHub cannot reach your localhost.) Instead, it must be the public IP address of your Enterprise Search server or the private IP if you are running GitHub internally. You will need to open firewall port 3002.
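As a small sanity check before you fill in the form, the following Python sketch builds the two URLs from a host name and refuses loopback addresses. The function names are my own, and port 3002 and the `/ent/` path come from the values above. The example above shows both `https` and `http`, so the scheme is a parameter: use whichever one your server actually serves.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """True for "localhost" or a loopback IP such as 127.0.0.1."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so treat it as an ordinary host name.
        return False


def oauth_app_urls(host: str, port: int = 3002, scheme: str = "http") -> dict:
    """Build the homepage and callback URLs for the GitHub OAuth App form."""
    if is_loopback(host):
        raise ValueError("use the server's public or private IP, not localhost")
    base = f"{scheme}://{host}:{port}"
    return {"homepage": base, "callback": f"{base}/ent/"}


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only example address.
    for label, url in oauth_app_urls("203.0.113.10").items():
        print(label, url)
```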

Click Register Application then note the client ID and client secret. You will put those credentials into Enterprise Search.

Add GitHub Source in Elastic Enterprise Search

Click on Add a Source.

Select GitHub.

Then follow the screens. If you are already logged in to GitHub, it will try to use those credentials, so log out of GitHub first.

Fill in the client ID and secret. You don’t enter a URL such as github.com/(your organization). Instead, GitHub locates your organization by the client ID.

Just as when you log in to an application using Facebook or Google, GitHub asks you for permission to connect the two. If you get an error message here, check the callback URL you entered above. GitHub needs to be able to reach that from the GitHub servers.
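For context on what happens after you grant permission: GitHub redirects the browser to the callback URL with `code` and `state` query parameters, or with `error` and `error_description` if something went wrong (for example `redirect_uri_mismatch`). Enterprise Search handles this internally. The sketch below only illustrates how an app could read such a callback, and the function name and sample URLs are hypothetical.

```python
import hmac
from urllib.parse import parse_qs, urlparse


def read_callback(callback_url: str, expected_state: str) -> str:
    """Return the temporary code from an OAuth callback URL.

    Raises RuntimeError if GitHub reported an error, and ValueError if
    the state does not match or the code is missing.
    """
    params = parse_qs(urlparse(callback_url).query)
    if "error" in params:
        detail = params.get("error_description", [""])[0]
        raise RuntimeError(f"{params['error'][0]}: {detail}")
    state = params.get("state", [""])[0]
    # Constant-time comparison of the state we sent with the one returned.
    if not hmac.compare_digest(state.encode(), expected_state.encode()):
        raise ValueError("state mismatch")
    code = params.get("code", [""])[0]
    if not code:
        raise ValueError("callback did not include a code")
    return code


if __name__ == "__main__":
    url = "http://203.0.113.10:3002/ent/?code=abc123&state=xyz"
    print(read_callback(url, expected_state="xyz"))
```

An error at this stage usually points back to the callback URL registered in the OAuth App, which is why that is the first thing to check.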

Click through this screen.

Changing configuration and handling debug errors

If you make a mistake, don’t click on “Add a source” again. Instead, go into settings in Enterprise Search, also located on the left-hand menu.

Then select the Configure button.

Verifying your connection works

You should see some activity now:

Searching

Oddly enough, the search screen in Enterprise Search is hidden. It’s not on the main landing http://(your server):3002. Instead, look on the left-hand side for Go to Search Application.

The search syntax is natural language, but you do need to use certain keywords (see Help with the Search Syntax). It’s not well documented yet.

When I type:

creator is walkerrowe

It shows these objects:

Then I typed the name of a repository I created, esearch. It presented this screen. Click on the item and it gives you the chance to look at it in GitHub.

You can refer to the Enterprise Search Searcher’s Manual for search syntax, but it gives very few examples. For example, it says that, as you type a search question, it highlights the words it finds in blue. That did not work for me using Chrome on Mac. It also seems to search files but not the content of files. In other words, it’s not indexing every word in your Google Docs or Sheets.

Since the documentation is sparse, consider asking questions on the Enterprise Search community.


These postings are my own and do not necessarily represent BMC's position, strategies, or opinion.

About the author

Walker Rowe

Walker Rowe is an American freelance tech writer and programmer living in Cyprus. He writes tutorials on analytics and big data and specializes in documenting SDKs and APIs. He is the founder of the Hypatia Academy Cyprus, an online school that teaches programming to secondary school children.