Installing Elasticsearch
Ubuntu
Dependencies
The easiest option is to install OpenJDK as follows:
sudo apt-get install openjdk-7-jre
Elasticsearch actually recommends using Oracle Java, which can be installed as follows:
sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get update
sudo apt-get -y install oracle-java8-installer
Installation
- Obtain the GPG key for the source
wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
- Add the source to Ubuntu's sources list
echo 'deb http://packages.elasticsearch.org/elasticsearch/1.4/debian stable main' | sudo tee /etc/apt/sources.list.d/elasticsearch.list
- Update the package list from the sources list
sudo apt-get update
- Install the Elasticsearch package
sudo apt-get -y install elasticsearch=1.4.4
Configuration
Configuration can be as extensive as your requirements demand. The out-of-the-box settings are good enough to get you started, with the following change.
- Edit the configuration file with a text editor (using vim for this example)
sudo vim /etc/elasticsearch/elasticsearch.yml
- Search for the line that contains `network.host` and change it to the following (you may just need to uncomment it). This will only allow connections from the local computer.
network.host: localhost
Running the Service
To start (or restart) the service use:
sudo service elasticsearch restart
To start the service at boot:
sudo update-rc.d elasticsearch defaults 95 10
Mac OS X
Installation
- Assuming that Homebrew (`brew`) is installed:
brew install elasticsearch
Configuration
Configuration can be as extensive as your requirements demand. The out-of-the-box settings are good enough to get you started, with the following change.
- Edit the configuration file with a text editor (using vim for this example)
sudo vim /usr/local/opt/elasticsearch/config/elasticsearch.yml
- Search for the line that contains `network.host` and change it to the following (you may just need to uncomment it). This will only allow connections from the local computer.
network.host: localhost
Running the Service
To start the service use:
elasticsearch --config=/usr/local/opt/elasticsearch/config/elasticsearch.yml
To start the service at boot:
ln -sfv /usr/local/opt/elasticsearch/*.plist ~/Library/LaunchAgents
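Before moving on, it can be worth confirming that the node is actually answering. Here is a minimal sketch in Python 3 (standard library only) that fetches the root URL and returns the JSON it gets back. The function name `elasticsearch_info` is made up for this example, and it assumes the default HTTP port of 9200 on localhost.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def elasticsearch_info(url="http://localhost:9200/", timeout=2.0):
    """Return the JSON document served at the root URL, or None if unreachable.

    Hypothetical helper for this walkthrough; the URL assumes the default port.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a body that is not JSON.
        return None


if __name__ == "__main__":
    info = elasticsearch_info()
    if info is None:
        print("Elasticsearch does not appear to be reachable")
    else:
        # The root document reports the release under version.number.
        print("Elasticsearch is up, version:", info.get("version", {}).get("number"))
```

If this prints that the node is unreachable, check that the service has started and that `network.host` is set as described above.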
Time to get it running with Django!
Okay, cool. We have Elasticsearch running and I have a Django project... now what?
This comes down entirely to what you need from your search. If you need fairly standard search that you don't have to customise or spend much time configuring, `django-haystack` is probably the way to go. If you need heavy customisation and want to take advantage of specific Elasticsearch features, then `elasticsearch` and `elasticsearch_dsl` are probably what you are looking for.
`django-haystack` allows the search backend (such as Elasticsearch) to be swapped out. Because of this, backend-specific functionality is not always supported, but common search functionality is.
We will look at using `django-haystack` now.
Django Haystack Installation
- Install using `pip`:
pip install django-haystack
- Add the app to `INSTALLED_APPS` in `settings.py`:
INSTALLED_APPS += ['haystack', ]
- Add configuration to `settings.py`:
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'haystack',
    },
}
- Add search views to your URL conf by adding the following URL pattern:
url(r'^search/', include('haystack.urls')),
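If the Elasticsearch URL differs between your development machine and your servers, one option is to read it from the environment instead of hard-coding it in `settings.py`. The following is only a sketch: the helper `haystack_connections` and the variable names `ELASTICSEARCH_URL` and `HAYSTACK_INDEX_NAME` are assumptions made for this example, not something Haystack defines.

```python
import os


def haystack_connections(environ=os.environ):
    """Build the HAYSTACK_CONNECTIONS dict, letting the environment override defaults."""
    return {
        "default": {
            "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
            # ELASTICSEARCH_URL and HAYSTACK_INDEX_NAME are hypothetical variable names.
            "URL": environ.get("ELASTICSEARCH_URL", "http://127.0.0.1:9200/"),
            "INDEX_NAME": environ.get("HAYSTACK_INDEX_NAME", "haystack"),
        },
    }


# In settings.py this replaces the literal dictionary.
HAYSTACK_CONNECTIONS = haystack_connections()
```

With no variables set, this produces the same configuration as the literal dictionary shown earlier.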
Django Haystack Installation
A template must be defined for the search view to use. It expects the template to be located at `search/search.html`. The example that Django Haystack provides in its documentation to get you started is as follows.
{% extends 'base.html' %}
{% block content %}
<h2>Search</h2>
<form method="get" action=".">
<table>
{{ form.as_table }}
<tr>
<td></td>
<td>
<input type="submit" value="Search">
</td>
</tr>
</table>
{% if query %}
<h3>Results</h3>
{% for result in page.object_list %}
<p>
<a href="{{ result.object.get_absolute_url }}">
{{ result.object.title }}
</a>
</p>
{% empty %}
<p>No results found.</p>
{% endfor %}
{% if page.has_previous or page.has_next %}
<div>
{% if page.has_previous %}
<a href="?q={{ query }}&page={{ page.previous_page_number }}">
{% endif %}
« Previous
{% if page.has_previous %}
</a>
{% endif %}
|
{% if page.has_next %}
<a href="?q={{ query }}&page={{ page.next_page_number }}">
{% endif %}
Next »
{% if page.has_next %}
</a>
{% endif %}
</div>
{% endif %}
{% else %}
{# Show some example queries to run, maybe query syntax, something else? #}
{% endif %}
</form>
{% endblock %}
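The pagination links in the template pass the search term and page number as the `q` and `page` GET parameters. If you ever need to build the same kind of link outside a template (in a test, say), here is a small sketch using the Python 3 standard library. The helper `search_page_url` is hypothetical, and the `/search/` prefix assumes the URL pattern added earlier.

```python
from urllib.parse import urlencode


def search_page_url(query, page, base="/search/"):
    """Return a search URL with the query and page number URL-encoded.

    Hypothetical helper; `base` assumes the search view is mounted at /search/.
    """
    return base + "?" + urlencode({"q": query, "page": page})
```

Using `urlencode` keeps spaces and special characters in the query from breaking the link.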
Django Haystack Integrating with Models
Django Haystack uses `SearchIndexes` (which are similar to Django models in terms of field-based storage) to determine which data should be used for searching.
Search indexes are generally created for each model type but can occasionally be used across many similar model types.
It is easiest to put your search indexes in a file named `search_indexes.py` inside your app as this allows Django Haystack to automatically detect it.
Django Haystack Search Index
If we have the following model:
from django.conf import settings
from django.db import models
from django.utils import timezone

class TestModel(models.Model):
    title = models.CharField(max_length=255)
    body = models.TextField()
    # on_delete is required from Django 2.0 and accepted by earlier versions
    author = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    publication_date = models.DateTimeField(default=timezone.now)
Django Haystack Search Index
The following index would be applicable:
from django.utils import timezone
from haystack import indexes
from myapp.models import TestModel

class TestModelIndex(indexes.SearchIndex, indexes.Indexable):
    # The main document field that searches run against
    text = indexes.CharField(document=True, use_template=True)
    author = indexes.CharField(model_attr='author')
    pub_date = indexes.DateTimeField(model_attr='publication_date')

    def get_model(self):
        return TestModel

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.filter(publication_date__lte=timezone.now())
Because we set `use_template=True` on the `text` field, we must provide a template for it to use. If we don't provide a template and configure which outputs we want it will concatenate the field values.
Django Haystack Search Index
The template for an index field lives at a path with the following pattern:
search/indexes/[the app name]/[model name used for the index]_[index field name].txt
So for the previous example the template would exist at:
search/indexes/myapp/testmodel_text.txt
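The naming pattern above can be written as a small function, which is handy for checking that a template file sits where it is expected. This is a sketch only: `index_template_path` is a hypothetical helper, and it assumes the model name appears in lowercase, as in the `testmodel_text.txt` example.

```python
def index_template_path(app_label, model_name, field_name):
    """Return the expected template path for an index field using use_template=True.

    Hypothetical helper mirroring the pattern described above; the model name
    is lowercased, matching the example path.
    """
    return "search/indexes/{}/{}_{}.txt".format(
        app_label, model_name.lower(), field_name
    )
```

For the index defined earlier, calling `index_template_path("myapp", "TestModel", "text")` gives the path shown above.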
The template is rendered once for each object in the queryset, with that object (in this case a `TestModel` instance) available in the context as `object`, so we can output the relevant fields as follows:
{{ object.title }}
{{ object.author.get_full_name }}
{{ object.body }}
Building and Rebuilding your index
To build your index run the following management command:
./manage.py rebuild_index
To update your index run the following management command:
./manage.py update_index
The update index command can be scheduled with a cron job to run at set intervals on your site.
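For the scheduled run, a thin wrapper script can be easier to call from cron than a long command line. The sketch below builds the `update_index` command and runs it with `subprocess`. The function names and the default `manage.py` path are assumptions made for this example. The `--age` option limits the update to objects changed within the given number of hours, and it relies on the index defining `get_updated_field`, which the example index above does not do.

```python
import subprocess
import sys


def update_index_command(manage_py="manage.py", age_hours=None):
    """Build the argument list for the update_index management command.

    Hypothetical helper; `manage_py` is the path to your project's manage.py.
    """
    command = [sys.executable, manage_py, "update_index"]
    if age_hours is not None:
        # Only reindex objects updated within the last `age_hours` hours.
        command.append("--age={}".format(age_hours))
    return command


def run_update_index(manage_py="manage.py", age_hours=None):
    """Run update_index and return the process exit code."""
    return subprocess.call(update_index_command(manage_py, age_hours))
```

A cron entry would then call a script that invokes `run_update_index` at whatever interval suits your site.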
You now have basic search!