It turns out that Google App Engine DOES have support for full-text search. It's just not documented, because the feature is still in development.
When App Engine first arrived, a lot of people, including myself, were baffled at the lack of full-text search in the Datastore API. What the fudge - Google is THE full-text search company, and their database solution does not have support for full-text indexing!
The datastore is built on top of Google's Bigtable, which is a huge-arse database that powers a lot of projects at Google, including search indexing. Yes, the insanely limited, strange datastore is what Google is using to power their blazingly fast search engine.
The Google App Engine API has a very primitive implementation of a full-text index for the datastore, hidden away in google.appengine.ext.search. (There is basically no documentation of it, so you have to read the source, lazy boy.) You use it by deriving your models from search.SearchableModel, instead of the usual db.Model.
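Here is a minimal sketch of what such a model might look like. The Article kind and the publishDate property are borrowed from the index.yaml example further down; the title and content properties and the sample values are assumptions made up for illustration. It only runs inside the (legacy Python) App Engine SDK, so treat it as a starting point rather than a reference.

```python
from google.appengine.ext import db
from google.appengine.ext import search


class Article(search.SearchableModel):
    # Properties are declared exactly as on a plain db.Model.
    title = db.StringProperty()      # assumed property, for illustration
    content = db.TextProperty()      # assumed property, for illustration
    publishDate = db.DateTimeProperty(auto_now_add=True)


# Saving works like any other model; the full-text index is built at this point.
Article(title='Full-text search on App Engine',
        content='It exists, it is just hidden away.').put()

# all() returns a query object that has an extra search() method.
query = Article.all().search('search engine').order('-publishDate')
articles = query.fetch(10)
```

Combining search() with order() is the kind of query that needs a composite index, which is what the "Index of the index" part below is about.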
Limitations
This is basically just "find entries that contain these words" - it has no exact phrase match, substring match, boolean operators, stemming, or other common full-text features.
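To make those limitations concrete, here is a toy model of a "bag of keywords" index in plain Python. It is NOT the code in google.appengine.ext.search, just a simplified stand-in; build_index and matches are names I made up for this sketch.

```python
import re


def build_index(text):
    """Return the set of lowercase words in text (a stand-in for the keyword index)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(index, query):
    """Return True if every word in the query is present in the index."""
    return build_index(query) <= index


doc = build_index("Running full-text searches on App Engine")

print(matches(doc, "engine app"))   # word order is ignored
print(matches(doc, "app running"))  # words need not be adjacent, so no phrase match
print(matches(doc, "search"))       # no stemming: "searches" is a different word
print(matches(doc, "eng"))          # no substring match
```

The first two lookups succeed and the last two fail, because the only question this kind of index can answer is "are all of these exact words stored for the entity?".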
The nitty gritty
Save latency
The philosophy behind the datastore is to take advantage of the fact that disk space is cheap, and to do calculations once, when a piece of data is stored, and keep the results. This applies to SearchableModels as well - they build the index for the entity when it is saved with put(). This means that instances of a SearchableModel take slightly longer to save than standard models. Keep this in mind.
Index of the index
As you might know, the Google App Engine SDK generates indexes in index.yaml for all queries that you run while you are developing the app. However, since you probably won't run every imaginable query during development, index.yaml might be incomplete, and you may need to add indexes to it by hand. In these cases, you need to know that the full-text index is stored in a property called __searchable_text_index. To add an index that includes the full-text index property:
- kind: Article
  properties:
  - name: __searchable_text_index
  - name: publishDate
    direction: desc
There you go! Full-text indexing on App Engine. Not perfect at all, but it works for a lot of scenarios!
8 comments:
Thanks for sharing this, Mattias.
No problem at all, Aral. Thanks for the comment, and thanks for reading.
Hi there just to let peeps know you can also add on a fetch() method to restrict the amount of results you get from your search query.
Bakery.all().search(keyword).order("-date").fetch(limit, offset)
Cheers,
Biffer
Thanks for this. I gave it a twirl and it works perfectly. However, the Google App Engine team still has a lot of development work to do on GQL.
Does it work with Django Helper?
There are also SearchableEntity and SearchableQuery if you dig into http://code.google.com/appengine/articles/bulkload.html
Plus, cannot set stop words yet.
BTW, can I change the UI language to English? It's hard to find the comments link, as everything is in Deutsch.
My blog post on the Google App Engine search API, which refers to your blog:
http://www.cnblogs.com/kuber/archive/2008/07/23/1249617.html
Thank you very much. It was very useful for the project I'm working on right now.