Sunday 25 November 2012

App Engine and Google Apps Script

Update on SocialSamplr

Been a while since my last post, but I've been very busy developing our social media scoring platform, so it's been hard to find the time to write much - but here's a new post on using memcache and BigQuery on App Engine:
http://socialsamplr.blogspot.co.nz/2013/05/using-bigquery-and-memcache-to-improve.html.  Things have been progressing very well and SocialSamplr is now on board at the Creative HQ business incubator here in Wellington, NZ, and we're looking forward to accelerating things a lot from here on in.
We've been following the "lean startup" process and think we now have the information to build a pretty cool and unique minimum viable product, so stay tuned to see how it progresses.  If anyone is interested in discussing what the product is going to offer, or even in being a beta tester, please contact me at danshannon@socialsamplr.com.  Also, being a cash-strapped startup, if anyone's interested in the source code for the Google Apps Script work I did last year, I'd be happy to provide it for a small fee, along with some training data to create your own sentiment engine (kind of a "build your own sentiment engine starter pack").  Again, just drop me a line and we can work something out (here's the link to see it in action: https://script.google.com/macros/s/AKfycbxr5a4NUFBTVzxX3GJtW2tbiW3A6K5d06pVo4DqZD3fUT0_pITy/exec).


Getting Started in Python and App Engine

The new Python site, which I'll be hosting on App Engine, is still in progress, but I've already got a couple of quick things to show, so I thought I'd do a short initial post.

As a start, I've installed Eclipse as the development environment, with the Python and App Engine extensions.  For the web development framework I've decided to go with Django (supported on App Engine), a MySQL database, and Twitter Bootstrap CSS for styling.  So far it's proven very easy to get started with Python (coming from a .NET background, I've been very impressed by the intuitive nature of the language, although open-source development environments take a bit of getting used to after Visual Studio).  Django provides an MVC-style pattern, which is a great way to build web apps, and it also has plenty of built-in functionality such as session management - which is limited, to say the least, in Google Sites.  The dev environment is shown below:




An initial sample of the new site in progress is shown here, with the nice new "web 2.0" look you get with Twitter Bootstrap (still getting my head around CSS a bit, but getting there!).





Integrating with Google Apps Script

The first thing I wanted to try was seeing how the site integrates with Google Apps Script, so I can use script gadgets easily within it.  The answer: very easily.

The first thing you'll need to do is run a web application on Google App Engine to act as a proxy site for your Google Apps Script.  A sample application has been very kindly provided by "Eddy" as per Issue 546 (related to hosting Google Apps Script gadgets on a custom domain).  See comment 104 on that issue for how to set up your proxy site application.  Once this is done you'll be able to host Apps Script gadgets within an iframe on your App Engine site - just replace the script.google.com part of the URL with yoursite.appspot.com and you should be good to go.
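That URL rewrite can be sketched as a one-line helper.  This is my own illustration, not code from the proxy sample - the proxy host name here is a placeholder for your own appspot domain:

```javascript
// Hypothetical helper: rewrite a published Apps Script gadget URL so it
// is served through your App Engine proxy instead of script.google.com.
// proxyHost (e.g. 'yoursite.appspot.com') is a placeholder value.
function toProxyUrl(gadgetUrl, proxyHost) {
  return gadgetUrl.replace('script.google.com', proxyHost);
}
```

For example, toProxyUrl('https://script.google.com/macros/s/ABC/exec', 'yoursite.appspot.com') gives 'https://yoursite.appspot.com/macros/s/ABC/exec', which you can then drop into an iframe on your site.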

Once you have set up your site to host Apps Script gadgets, you can also set up server-side communication between your App Engine site and Google Apps Script.  On the Python side, calling your Apps Script page is very simple using the built-in urlfetch module:


from google.appengine.api import urlfetch

url = "https://script.google.com/a/macros/feedbacksampler.com/s/AKfycbzZRwgEAjK436jopUQvj9KLddO32-0Bji-332a57Rh73l6fcbc/exec?appkey=YOURAPPKEYHERE"

# The response body returned by the Apps Script is available as result.content.
result = urlfetch.fetch(url)

Then in Apps Script, by using the ContentService, you can simply return the response as a string from the doGet function - this will then be returned in the result variable seen above.  Here we first check the appkey from the query string to verify the requester's identity, and then return a value.
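The original screenshot of that function is no longer available, so here's a minimal sketch of what a doGet along those lines could look like.  The APP_KEY value and the response text are placeholders of my own, not the original code:

```javascript
// Shared secret expected on the query string (placeholder value).
var APP_KEY = 'YOURAPPKEYHERE';

// Pure helper so the key check is easy to reuse.
function isValidAppKey(key) {
  return key === APP_KEY;
}

// Entry point for GET requests to the published script.
function doGet(e) {
  var key = (e && e.parameter) ? e.parameter.appkey : '';
  var body = isValidAppKey(key) ? 'hello from Apps Script' : 'invalid appkey';
  // ContentService returns the string as the HTTP response body,
  // which ends up in the `result` variable on the Python side.
  return ContentService.createTextOutput(body);
}
```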




This provides a very simple and secure way (all communication being over SSL) to communicate between an App Engine site and Google Apps Script.  It's particularly good for being able to quickly develop and deliver functionality without having to redeploy your whole site.  In addition, it allows the use of all the APIs available in Google Apps Script without having to code up all your authentication (for example, OAuthing in first).  As long as all the requests are kept server-side, it should be nice and secure.  Obviously it will be slower than executing everything within your Python app, but it does give some great options around flexibility.

So that's about it for now - I'll have some more stuff soon.


Real-time Sentiment Scoring

As a footnote, I've created a new real-time sentiment scoring engine for any subject using data from Facebook, Google+ and Twitter and enriched with Klout.  Check it out on the existing website at http://www.socialsamplr.com.


Saturday 3 November 2012

Monitoring Facebook and Twitter Data


Monitoring Facebook and Twitter Data from Google Apps Script

This week I'll be covering how to set up automated monitoring of Facebook and Twitter data from Google Apps Script.  Then, using Google Charts, BigQuery and Google Sites, you can easily set up a web application to report on the data.

Set up your Twitter and Facebook Applications

The first thing to do is set up your Twitter and Facebook applications (for Facebook, see an earlier blog post).  This is so you can get the client ID and client secret you'll need to obtain OAuth tokens to read the data you want.  A couple of key points to note:
  • To set up Facebook monitoring of users' walls or news feeds you'll need to request the additional read_stream permission (passed under "scope" in the query string).  Also, in my case I'm sending email notifications as part of the functionality I'm offering through my site, so I request the email address as well.  The request URL winds up looking something like this (fbmonitoring being the page we're redirecting back to):

    https://www.facebook.com/dialog/oauth?client_id=xxxxxxxx&redirect_uri=https://sites.google.com/a/feedbacksampler.com/socialsampler/fbmonitoring&state=xxxxxxx&scope=read_stream,email
  • Make sure you read carefully and stick to the rules of the road for both Twitter and Facebook.  In particular, if you're using the REST API for Twitter you need to be aware of the quota limits on the number of requests you make and how you process results.  Break the rules too often and you risk being blocked from using the API altogether.
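The dialog URL from the first point above can be assembled with a small helper.  This is my own illustrative sketch - the client ID, redirect URI and state values are placeholders supplied by the caller:

```javascript
// Hypothetical helper: build the Facebook OAuth dialog URL with the extra
// read_stream and email scopes described above. All argument values are
// placeholders to be supplied by the caller.
function buildFacebookOAuthUrl(clientId, redirectUri, state, scopes) {
  return 'https://www.facebook.com/dialog/oauth' +
      '?client_id=' + encodeURIComponent(clientId) +
      '&redirect_uri=' + encodeURIComponent(redirectUri) +
      '&state=' + encodeURIComponent(state) +
      '&scope=' + encodeURIComponent(scopes.join(','));
}
```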

Running the Automated Monitoring

The automated monitoring for Facebook and Twitter shares some similarities and, depending on the type of monitoring you're doing, also has some differences.  There are also some key restrictions on running "server-side" processes in Google Apps Script that you need to be aware of: mainly the five-minute execution timeout, the inability to create a "hanging GET" connection when connecting to another web server via UrlFetch, and the odd unexplained error you receive (my personal favourite being "backend error").
To overcome some of these limitations, I recommend the following guidelines.

  • Make use of ScriptDb to track exactly where you are in your monitoring process, so that the next time the script runs it can pick up from where it left off.  I created a function, run at the beginning of each script, which skips ahead to the last row processed in the previous run.

  • Utilise parallel processing for your Apps Script processes.  For a given process (for example Twitter monitoring), the five-minute execution limit means each run can only monitor a very limited amount of Twitter data.  By splitting my monitoring across a series of parallel processes (what I've called streams), I can use multiple Apps Scripts to do the monitoring.  Obviously this results in a somewhat more complex script structure, but it does build in a level of scalability.
  • Build redundancy into your processes to ensure data is not missed when the delightful "backend error" is encountered.  By this I mean: make each individual task you're performing atomic and log it to ScriptDb, so that if there's a failure, the next run will know to re-run that task, but not the ones beforehand that executed successfully.
  • If you make heavy use of ScriptDb for automated processes, make sure you have daily or weekly "clean-up" processes to ensure the database doesn't get too full.  Any real data you're storing should be in some other data store (in my case BigQuery - again loaded via a batch process run through Google Apps Script, keeping it in the family).  A good generic pattern is a batch process that removes old data from ScriptDb - in this case, old Facebook sessions.
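The checkpoint and clean-up patterns from the list above can be sketched roughly as follows.  The record shapes, type names and age threshold are my own assumptions rather than the original code (and note ScriptDb itself has since been deprecated by Google):

```javascript
// Assumed checkpoint record shape: { type: 'checkpoint', stream: n, lastRow: n }.

// Run at the start of each triggered script: find where the previous
// run got to for this monitoring stream, so processing can resume there.
function getLastProcessedRow(stream) {
  var db = ScriptDb.getMyDb();
  var results = db.query({ type: 'checkpoint', stream: stream });
  return results.hasNext() ? results.next().lastRow : 0;
}

// Record progress after each atomic unit of work completes.
function saveCheckpoint(stream, lastRow) {
  var db = ScriptDb.getMyDb();
  var results = db.query({ type: 'checkpoint', stream: stream });
  var item = results.hasNext() ? results.next()
                               : { type: 'checkpoint', stream: stream };
  item.lastRow = lastRow;
  db.save(item);
}

// Pure helper: decide whether a stored session is old enough to delete.
function isExpired(createdMs, nowMs, maxAgeMs) {
  return (nowMs - createdMs) > maxAgeMs;
}

// Generic batch clean-up, e.g. removing Facebook sessions over a week old.
function cleanUpSessions() {
  var db = ScriptDb.getMyDb();
  var weekMs = 7 * 24 * 60 * 60 * 1000;
  var results = db.query({ type: 'fbSession' });
  while (results.hasNext()) {
    var item = results.next();
    if (isExpired(item.createdMs, new Date().getTime(), weekMs)) {
      db.remove(item);
    }
  }
}
```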


Sample Code for Monitoring Twitter and Facebook

This section shows some straightforward code to process tweets and Facebook messages.  In my case I then use the content for sentiment analysis, but it can obviously be used for any other downstream process.  Note that the content comes back as a JSON string, which can be parsed to get all the components returned (such as geo-location etc.).  The Twitter code uses the built-in OAuth 1 support, passed in the "options" of the UrlFetchApp.fetch method.  For Facebook we're using the token received after authenticating through OAuth 2.  For a full breakdown of Facebook Query Language (FQL), which I use to get data from Facebook, check this link: https://developers.facebook.com/docs/reference/fql/.  The URLs sent to Facebook have the access token tacked on the end.

Twitter:
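The original Twitter screenshot is no longer available, so here's a rough sketch of the pattern described above, using Apps Script's (since deprecated) built-in OAuth 1 support.  The consumer key/secret and search term are placeholders, and the search endpoint shown is an assumption on my part:

```javascript
// Pure helper: build the search URL for a term.
function buildTwitterSearchUrl(term) {
  return 'https://api.twitter.com/1.1/search/tweets.json?q=' +
      encodeURIComponent(term);
}

// Fetch recent tweets for a term. OAuth 1 signing happens via the
// "options" object passed to UrlFetchApp.fetch.
function fetchTweets(term) {
  var oauthConfig = UrlFetchApp.addOAuthService('twitter');
  oauthConfig.setConsumerKey('YOUR_CONSUMER_KEY');       // placeholder
  oauthConfig.setConsumerSecret('YOUR_CONSUMER_SECRET'); // placeholder
  oauthConfig.setRequestTokenUrl('https://api.twitter.com/oauth/request_token');
  oauthConfig.setAuthorizationUrl('https://api.twitter.com/oauth/authorize');
  oauthConfig.setAccessTokenUrl('https://api.twitter.com/oauth/access_token');

  var response = UrlFetchApp.fetch(buildTwitterSearchUrl(term), {
    oAuthServiceName: 'twitter',
    oAuthUseToken: 'always'
  });
  // The response is a JSON string; each status carries text, user, geo, etc.
  return JSON.parse(response.getContentText()).statuses;
}
```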

Facebook:
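Similarly for Facebook, here's a hedged sketch of the FQL call with the access token tacked on the end of the URL.  The query itself is illustrative only (and note that FQL and the stream table have since been retired by Facebook):

```javascript
// Pure helper: build an FQL request URL with the access token appended,
// as described above. Token and query are supplied by the caller.
function buildFqlUrl(fql, accessToken) {
  return 'https://graph.facebook.com/fql?q=' + encodeURIComponent(fql) +
      '&access_token=' + encodeURIComponent(accessToken);
}

// Read recent posts from the authenticated user's news feed/wall via FQL,
// using the OAuth 2 token obtained earlier.
function fetchWallPosts(accessToken) {
  var fql = 'SELECT post_id, actor_id, message, created_time ' +
            'FROM stream WHERE source_id = me() LIMIT 50';
  var response = UrlFetchApp.fetch(buildFqlUrl(fql, accessToken));
  // The response is a JSON string with the rows under "data".
  return JSON.parse(response.getContentText()).data;
}
```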

Next post..

So that's it for now.  The next post may be a bit delayed, as I'm moving my application to Python and Google App Engine in preparation for a fully "commercial release" of my sentiment analysis engine - so there should be lots of interesting material to write about once that's done.  In the meantime, all of the above can be seen working at www.socialsamplr.com.

Over and out.
Footnote: Nice to see the sentiment engine tracking our "foot-in-mouth" PM this week... http://goo.gl/ePXM5