Sunday 25 November 2012

App Engine and Google Apps Script

Update on SocialSamplr

It's been a while since my last post - I've been very busy developing our social media scoring platform so it's been hard to find the time to write much - but here's a new post on using memcache and BigQuery on App Engine:
http://socialsamplr.blogspot.co.nz/2013/05/using-bigquery-and-memcache-to-improve.html.  Things have been progressing very well: SocialSamplr is now on-boarded to the Creative HQ business incubator here in Wellington, NZ, and we're looking forward to accelerating things a lot from here on in.
We've been following the "lean startup" process and think we now have the information to build a pretty cool and unique minimum viable product, so stay tuned to see how it progresses.  If anyone is interested in discussing what the product is going to offer, or even in being a beta tester, please contact me at danshannon@socialsamplr.com.  Also, being a cash-strapped startup, if anyone's interested in the source code for the Google Apps Script work I was doing last year, I'd be happy to provide it for a small fee along with some training data to create your own sentiment engine (kind of a "build your own sentiment engine starter pack").  Again, just drop me a line and we can work something out (here's the link to see that in action: https://script.google.com/macros/s/AKfycbxr5a4NUFBTVzxX3GJtW2tbiW3A6K5d06pVo4DqZD3fUT0_pITy/exec).


Getting Started in Python and App Engine

The new Python site, which I'll be hosting on App Engine, is still in progress, but I've already got a couple of quick things to show so I thought I'd do a short initial post.

As a start I've installed Eclipse to use as the development environment, with the Python and App Engine extensions.  For the web development framework I've decided to go with Django (supported on App Engine) and a MySQL database, and I'm using Twitter Bootstrap CSS for styling.  So far it's proven very easy to get started with Python (coming from a .NET background I've been very impressed with the intuitive nature of the language, although open-source development environments take a bit of getting used to after Visual Studio).  The Django framework provides an MVC-style pattern, which is a great way to build web apps, and it also has plenty of built-in functionality like session management - which is limited, to say the least, in Google Sites.  The dev environment is shown below:




An initial sample of the new site in progress is shown here, with the nice new "web 2.0" look that you get with Twitter Bootstrap (still getting my head around CSS a bit, but getting there!).





Integrating with Google Apps Script

So the first thing I wanted to try was to see how it integrates with Google Apps Script, so I can use script gadgets easily within the site.  The answer: very easily.

The first thing you'll need to do is run a web application in Google App Engine to act as a proxy site for your Google Apps Script.  A sample application has been very kindly provided by "Eddy" as per Issue 546 (related to hosting Google Apps Script gadgets on a custom domain); see comment 104 on that issue for how to set up your proxy site application.  Once this is done you'll be able to host Apps Script gadgets within an iframe on your App Engine site - just replace the script.google.com part of the URL with yoursite.appspot.com and you should be good to go.

Once you have set up your site to host Apps Script gadgets you can also set up server-side communication between your App Engine site and Google Apps Script.  On the Python side, calling your Apps Script page is very simple using the built-in urlfetch method:


url = "https://script.google.com/a/macros/feedbacksampler.com/s/AKfycbzZRwgEAjK436jopUQvj9KLddO32-0Bji-332a57Rh73l6fcbc/exec?appkey=YOURAPPKEYHERE"

   
result = urlfetch.fetch(url)

Then in Apps Script, by using the ContentService, you can simply return the response as a string from the doGet function - this then ends up in the result variable seen above.  Here we first check the appkey from the query string to verify the requesting identity and then return a value.
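Here's a minimal sketch of what that looks like (the appkey value and the returned string are placeholders, not the ones used on the real site):

  function doGet(e) {
    // Check the appkey from the query string to verify the requesting identity.
    if (e.parameter.appkey !== 'YOURAPPKEYHERE') {
      return ContentService.createTextOutput('unauthorised');
    }
    // Return whatever value the App Engine side needs, as plain text.
    return ContentService.createTextOutput('some new value');
  }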




This provides a very simple and secure way (all communication being via SSL) to communicate between an App Engine site and Google Apps Script.  It's particularly good for being able to quickly develop and deliver functionality without having to redeploy your whole site.  In addition, it allows the use of all of the APIs available with Google Apps Script without having to code up all your authentication (for example OAuthing in first).  So long as all the requests are kept server-side it should be nice and secure.  Obviously it will be slower than executing everything within your Python app, but it does give some great options around flexibility.

So that's about it for now, will have some more stuff soon.  


Real-time Sentiment Scoring

As a footnote, I've created a new real-time sentiment scoring engine for any subject using data from Facebook, Google+ and Twitter and enriched with Klout.  Check it out on the existing website at http://www.socialsamplr.com.


Saturday 3 November 2012

Monitoring Facebook and Twitter Data


Monitoring Facebook and Twitter Data from Google Apps Script

This week I'll be covering how to set up automated monitoring of Facebook and Twitter data from Google Apps Script.  Then, by using Google Charts, BigQuery and Google Sites, you can easily set up a web application to report on the data.

Set up your Twitter and Facebook Applications

The first thing to do is set up your Twitter and Facebook applications (for Facebook, see an earlier blog post).  This is so you can get the client ID and client secret you'll need to obtain OAuth tokens to read the data you want.  A couple of key points to note:
  • To set up Facebook monitoring of a user's wall or news feed you'll need to request the additional read_stream permission (passed in under "scope" in the query string).  Also, in my case I'm sending email notifications as part of the functionality I'm offering through my site, so I request the email address as well.  So the request URL winds up looking something like this (fbmonitoring being the page we're redirecting back to):

    https://www.facebook.com/dialog/oauth?client_id=xxxxxxxx&redirect_uri=https://sites.google.com/a/feedbacksampler.com/socialsampler/fbmonitoring&state=xxxxxxx&scope=read_stream,email
  • Make sure you read and stick to the rules of the road for both Twitter and Facebook.  In particular, if you're using the REST API for Twitter you need to be aware of the quota limits on the number of requests you make and how you process results.  Break the rules too often and you risk being blocked from using the API altogether.

Running the Automated Monitoring

The automated monitoring processes for Facebook and Twitter share some similarities and, depending on the type of monitoring you're doing, some differences.  There are also some key restrictions on running "server-side" processes in Google Apps Script that you need to be aware of.  Mainly these are the 5 minute execution timeout, the inability to create a "hanging GET" connection when connecting to another web server via UrlFetch, and the odd unexplained error you receive (my personal favourite being "backend error").
To overcome some of these limitations I recommend following these guidelines.

  • Make use of ScriptDb to track exactly where you are in your monitoring process so that the next time the script runs it can pick up from where it left off.  I created a function, run at the beginning of each script, which moves to the last row processed by the previous run - see the first sketch after this list for the general idea.

  • Utilise parallel processing for your Apps Script processes.  By this I mean that for a given process (for example Twitter monitoring), the 5 minute execution time limit means I can only monitor a very limited amount of Twitter data in one run.  By splitting my monitoring across a series of parallel processes (what I've called streams) I can use multiple Apps Scripts to do the monitoring.  Obviously this does result in a somewhat more complex structure to the scripts, but it builds in a level of scalability.
  • Build redundancy into your processes to ensure data is not missed when the delightful "backend error" is encountered.  By this I mean make each individual task you're performing atomic and log it to ScriptDb, so if there's a failure the next run knows to re-run that task, but not the ones beforehand that executed successfully.
  • If you make heavy use of ScriptDb for automated processes, make sure you have daily or weekly "clean-up" processes that run to ensure the database doesn't get too full.  Any real data you're storing should be in some other data store (in my case BigQuery - again this is another batch process run through Google Apps Script - keeping it in the family).  The second sketch below shows a generic way to remove old data from ScriptDb as a batch process - in this case removing old Facebook sessions.
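First, a minimal sketch of the checkpoint idea (the 'twitterCheckpoint' record type and field names are made up for illustration - my actual function does more than this):

  function getLastProcessedRow(stream) {
    var db = ScriptDb.getMyDb();
    var results = db.query({type: 'twitterCheckpoint', stream: stream});
    if (results.hasNext()) {
      return results.next().lastRow;   // pick up where the previous run finished
    }
    return 0;                          // first run for this stream
  }

  function saveLastProcessedRow(stream, row) {
    var db = ScriptDb.getMyDb();
    var results = db.query({type: 'twitterCheckpoint', stream: stream});
    var record = results.hasNext() ? results.next() : {type: 'twitterCheckpoint', stream: stream};
    record.lastRow = row;              // overwrite the checkpoint for the next run
    db.save(record);
  }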

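And a sketch of the generic clean-up, assuming each record carries a 'type' field and a 'created' timestamp in milliseconds (the names are illustrative):

  function cleanUpOldRecords(type, maxAgeDays) {
    var db = ScriptDb.getMyDb();
    var cutoff = new Date().getTime() - maxAgeDays * 24 * 60 * 60 * 1000;
    var results = db.query({type: type});
    while (results.hasNext()) {
      var record = results.next();
      if (record.created < cutoff) {
        db.remove(record);             // delete anything older than the cut-off
      }
    }
  }

  // For example, run daily from a time-driven trigger to clear old Facebook sessions:
  // cleanUpOldRecords('facebookSession', 1);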

Sample Code for Monitoring Twitter and Facebook

This section basically just shows some straightforward code to process tweets and Facebook messages.  In my case I then use the content for sentiment analysis, but it can obviously be used for any other downstream process.  Note the content comes back as a JSON string which can be parsed to get all of the components returned (such as geo location etc.).  For Twitter the code uses the built-in OAuth 1 support - passed in via the "options" in the UrlFetchApp.fetch method.  For Facebook we're using the token received after authenticating through OAuth 2.  For a full breakdown of Facebook Query Language (FQL), which I use to get data from Facebook, check this link: https://developers.facebook.com/docs/reference/fql/.  The URLs sent to Facebook have the access token tacked on the end.

Twitter:
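A minimal sketch of the Twitter side, using the legacy OAuthConfig support in UrlFetchApp (the consumer key/secret placeholders, service name and search query are illustrative):

  function fetchTweets() {
    var oauthConfig = UrlFetchApp.addOAuthService('twitter');
    oauthConfig.setRequestTokenUrl('https://api.twitter.com/oauth/request_token');
    oauthConfig.setAuthorizationUrl('https://api.twitter.com/oauth/authorize');
    oauthConfig.setAccessTokenUrl('https://api.twitter.com/oauth/access_token');
    oauthConfig.setConsumerKey('YOUR_CONSUMER_KEY');
    oauthConfig.setConsumerSecret('YOUR_CONSUMER_SECRET');

    var options = {oAuthServiceName: 'twitter', oAuthUseToken: 'always'};
    var url = 'https://api.twitter.com/1.1/search/tweets.json?q=' + encodeURIComponent('#nzpolitics');
    var response = UrlFetchApp.fetch(url, options);

    // The content comes back as a JSON string - parse it to get at the individual tweets.
    var tweets = JSON.parse(response.getContentText()).statuses;
    for (var i = 0; i < tweets.length; i++) {
      Logger.log(tweets[i].text);   // downstream: pass to sentiment scoring, save to BigQuery, etc.
    }
  }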

Facebook:
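And a sketch of the Facebook side, reading a user's stream via FQL with an OAuth 2 access token obtained earlier (the query and field names are illustrative):

  function fetchFacebookPosts(accessToken) {
    var fql = 'SELECT post_id, message, created_time FROM stream WHERE source_id = me()';
    var url = 'https://graph.facebook.com/fql?q=' + encodeURIComponent(fql) +
              '&access_token=' + accessToken;   // token tacked on the end of the URL
    var response = UrlFetchApp.fetch(url);
    var posts = JSON.parse(response.getContentText()).data;
    for (var i = 0; i < posts.length; i++) {
      Logger.log(posts[i].message);             // downstream: sentiment scoring etc.
    }
  }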

Next post...

So that's it for now. The next post may be a bit delayed as I'm moving my application to Python and Google App Engine in preparation for a fully "commercial release" of my sentiment analysis engine - so there should be lots of interesting material to write about once that's done.  In the meantime all of the above can be seen working at www.socialsamplr.com.  

Over and out.
Footnote: Nice to see the sentiment engine tracking our "foot-in-mouth" PM this week... http://goo.gl/ePXM5



Thursday 11 October 2012

Using Google Prediction

So this post is all about using Google Prediction and some of the experience I've gained so far using it on my web application.  This is some pretty awesome technology, and the way Google have exposed it and made it so easy to use should open up all sorts of opportunities for developers to get value out of it.

The Basics

The basics of Google's prediction services are pretty easy to follow.  Just go to https://developers.google.com/prediction/ and follow the instructions to get up and running.  Remember, you'll need to sign up for billing for this service, although you do get a courtesy 10,000 calls free a month, which is plenty to play around with - nothing gets charged to your credit card until you go over that.

All you need as an initial sample is some dummy data saved into a CSV file.
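For illustration, a few made-up rows (the first column is the label you want the model to predict, the second is the text to learn from):

  "positive","really loving the new phone, works great"
  "negative","worst customer service I have ever had"
  "neutral","the store opens at 9am tomorrow"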



Tip: The prediction engine IS case sensitive.  In other words you either have to give it two sets of data to train with (lower case and upper case) or give it one set and cast any text to the correct case when you want to run an actual prediction (which is what I've done).

Load the CSV file into your Google Cloud Storage area and you're ready to train your model.  This is as easy as going to the prediction interface, giving your model an ID and putting in the Cloud Storage path.


Then you can easily try out your new model by choosing prediction.trainedmodels.predict and passing the ID (as above) and any text string you'd like to test.

So that's it, pretty simple so far.

Now It Gets Interesting....

So in its simple form like the above you can see how the prediction engine works and get some simple samples working.  Now, if you want to get serious about using this tool, the key thing above all is that you're going to need data - lots and lots of data.

What I'm building here is a sentiment analysis tool and, trust me, the amount of time I've put into building the model so far has been significant to say the least.  I've now gone in excess of 10,000 training samples and I'm fairly happy with how it's been going.  So far I've been training the model purely manually, as I've wanted to ensure the quality of the model is not compromised.

What's really cool with the prediction tool is that each time you refresh your model you can run an "analyse" command to see how well your model is measuring up in the different categories you're training it on.
First, it gives you a breakdown with some warnings of "stray" values in your source data, which makes it easy to cleanse your source data when it starts getting very large.  This is seen in the screen-shot below.



Then it also gives you an excellent breakdown of how many samples you're providing for each label you're scoring on.  This is key to ensuring you don't "over-train" your model on some categories and not others.  Finally it gives a "confusion matrix" - see below.  This shows how close the different categories you have are to each other: the closer the numbers between the categories, the more chance they have of getting confused.




You can see how the general sentiment analysis works from my model at http://www.socialsamplr.com/prediction.  Type a phrase (or copy and paste from anywhere) into the top text box and click the "Get Response" button to see what score it gives.

In addition, I've been training my model across a range of categories which is more of a work-in-progress - this can be checked out here http://www.socialsamplr.com/lab.  

Sourcing Data

There is obviously a rich array of data from social media which you can source to train your model with.  Just make sure you stick on the right side of any rules when you do it.  For example, Twitter allow you to manually copy and paste data for scoring but any automated process where you're saving actual tweets to a cloud data source is against the developer rules of the road.

In short, there's not really an easy way to build up a really good model for sentiment analysis.  It takes a lot of legwork and a lot of ongoing work to ensure it remains accurate and relevant.  Another factor to be aware of when first training your model: if the training data is biased towards positive or negative content, the model will need to be continually updated over time to account for that (for example, if a topic is having a bad week in the news this could skew any data you use for that week).  However, when you do eventually get your model working well it's pretty cool to start applying it and seeing the results.

Using Prediction in Google Apps Script

This couldn't be easier.  Set a reference under Resources -> Use Google APIs.


Then write your code...
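As a rough sketch, a call to a trained model looks something like this (the model ID is a placeholder, and the exact method names depend on the version of the Prediction API you've enabled):

  function scoreText(text) {
    // Cast to lower case to match the training data (the prediction engine is case sensitive).
    var request = { input: { csvInstance: [text.toLowerCase()] } };
    var result = Prediction.Trainedmodels.predict(request, 'mysentimentmodel');
    Logger.log(result.outputLabel);   // the predicted category, e.g. "positive"
    return result.outputLabel;
  }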




Some Data Samples

So far, the way SocialSamplr tracks social media sentiment is through ongoing monitoring of topics defined by groups of hashtags.  Although it's early days, I've already seen some interesting data coming through on some of the topics being tracked, as shown below.  Over time I'll hopefully be able to overlay these charts with the current events occurring at the time and derive some really interesting data mining samples.


Next Post...

So that's about it for this week, hope it's been interesting.  The next post will cover real-time monitoring of Facebook data (which I'll hopefully complete testing on soon) and streaming of Twitter data using Python on App Engine.  Maybe a chat about BigQuery as well.  Any questions or feedback in the meantime, just let me know.


Thanks

Daniel

Wednesday 26 September 2012

Using the charts and dashboard

Hi, here's my next post on using the dashboard charts from Google in the design of a site.  They're a very easy-to-use feature and give you a great way of quickly displaying data in a user-friendly format.


Basics

The basics of using dashboard charts are pretty nicely covered in this article, which I'd advise reading first: https://developers.google.com/apps-script/articles/charts_dashboard.  They're pretty easily created and rendered onto the screen, and CSS can be applied to give them some extra formatting.

For the purpose of this article I'm going to show how I've implemented them, and in particular how I generated a comparative chart from two separate queries, as shown in this screenshot.


Querying and Preparing the Data

In my case what I've generated is some dynamic charts based on data from a BigQuery database.  There are three types of charts:


  • Pie chart for a straight summary of data
  • Bar and line charts for a "timeline" based chart of data
  • In addition bar and line charts can be used for comparative analysis between two data sets
The first thing to do is query my BigQuery tables.  In the code I'll cover the example where I produce the comparative charts, as that's the most interesting.  First, run the queries - a rough sketch of the shape of this step follows.
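In this sketch the project ID, table and field names are placeholders, and the real code does quite a bit more:

  function runComparativeQueries(uiApp, subjectA, subjectB) {
    var dataSetA = [];   // results for the first subject, converted to percentages later
    var dataSetB = [];   // results for the second subject
    var projectId = 'your-project-id';
    var sqlA = 'SELECT scoreDate, score FROM socialdata.scores WHERE subject = "' + subjectA + '"';
    var sqlB = 'SELECT scoreDate, score FROM socialdata.scores WHERE subject = "' + subjectB + '"';
    var resultsA = BigQuery.Jobs.query({query: sqlA}, projectId);
    var resultsB = BigQuery.Jobs.query({query: sqlB}, projectId);
    // If there's nothing to chart, give the user a friendly message instead of an empty chart.
    if (!resultsA.rows || !resultsB.rows) {
      uiApp.add(uiApp.createLabel('No data found for that comparison - try a different selection.'));
      return null;
    }
    return {rowsA: resultsA.rows, rowsB: resultsB.rows, arrays: [dataSetA, dataSetB]};
  }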

There are a few interesting points in the code above.  First, I declare two arrays - this is because the data I get back from the queries is in absolute numbers and I have to convert it to percentages for the charts.  Also, the actual comparisons between the data sets are done from those arrays, so performance is critical for the subsequent data processing - and where you need performance, arrays are the way to go.  Finally, note the check for no data to display - if there's none, we can give the user a nice, friendly message.

The next bit of code is probably more difficult to make sense of, but this is where I populate the arrays with the data from the queries as percentages, by keeping a running total of all the values.

So now that we have everything in the two arrays from our two queries we can really get the benefit of using arrays when we need to loop through and "join" the two sets of data based on date.  




So now we've done all the hard work, we can populate our charts with the data.  Note that for the bar and line charts I pass in three columns of data so the charts can be shown alongside each other - you can add any number of columns, so you could do even more comparisons if you want.  Notice also in the last section of code I create a shortened URL using the built-in Google API so people can easily share the chart.
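A sketch of what that charting step might look like (the column names, dimensions and the long URL passed in are illustrative):

  function buildComparisonChart(uiApp, joinedRows, longUrl) {
    var dataBuilder = Charts.newDataTable()
        .addColumn(Charts.ColumnType.STRING, 'Date')
        .addColumn(Charts.ColumnType.NUMBER, 'Subject A %')
        .addColumn(Charts.ColumnType.NUMBER, 'Subject B %');
    for (var i = 0; i < joinedRows.length; i++) {
      dataBuilder.addRow([joinedRows[i].date, joinedRows[i].scoreA, joinedRows[i].scoreB]);
    }
    var chart = Charts.newLineChart()
        .setDataTable(dataBuilder.build())
        .setTitle('Sentiment comparison')
        .setDimensions(600, 300)
        .build();
    uiApp.add(chart);
    // Shorten a link to the chart so people can easily share it.
    var shortUrl = UrlShortener.Url.insert({longUrl: longUrl}).id;
    uiApp.add(uiApp.createAnchor('Share this chart', shortUrl));
  }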



As always, to see this in action "for real" just check out www.socialsamplr.com

Until next time...
My next post may not be for a couple of weeks as I'm off to *hopefully* catch some trout up at Lake Taupo next weekend, so it'll have to wait until after that.  The next post will be on my experiences developing my own sentiment analysis engine using Google Prediction.

Till then, all the best

cheers

Daniel



Monday 17 September 2012

Creating a good user experience

Creating a Compelling User Experience - part 1.

Hi, in the previous post I covered setting up Facebook logins for your Google site using OAuth 2.0.  This post will cover more around building up a nice, responsive, usable UI.  I have to confess this is not normally my number one area but for the purpose of my start-up it's something I'm having to get my head around (with some much-appreciated assistance).  Hence, this is part 1 of my journey with this but I think I've already got some interesting things to show.


Basics

The basics of building a good user experience with Google sites and Apps script gadgets is to adhere to the following:

  • Maximise the use of client events wherever possible - and I mean really maximise them, the less round trips to the server you can make, the better user experience you'll provide.
  • Make sure when you do a round-trip to the server you make it obvious to the user some processing is happening if you don't want them randomly clicking on other buttons on the screen.  Remember, GAS gadgets all run asynchronously on the server so you can get some funny behaviour if you're not careful.
  • Also, if you have really long running operations be aware of mechanisms like cache services (I haven't really had the need to use this yet but it's there if needed).  
  • All these things are also summarised here  https://developers.google.com/apps-script/best_practices  but hopefully I can give some "real-life" examples.
  • For appearance I've found it best to use in-line CSS, which I'll also demo.
  • Finally I make extensive use of the built in charts that come with Google Apps which are pretty awesome.  I won't go into a lot of details in this post on those, will leave them for a future post since they would have enough material to warrant it.

My Example

To see how most of this works in real-life just go have a look at www.socialsamplr.com or check out the screenshot.


Step 1 - Create the controls and make them look pretty

The first task I had was to load the controls on the page and make them look pretty - at least to me (if you think they're fugly, let me know - I'm always open to feedback).  This is best done by applying in-line CSS.  The code I wrote to achieve this is as follows.  First, create a bunch of controls.


Then apply in-line CSS to them:


And here's a sample of the functions applying the styles.
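A condensed sketch of the pattern (the control names, colours and fonts here are illustrative, not the ones used on the site):

  function createControls(uiApp) {
    var buttonRun = uiApp.createButton('Get Response').setId('buttonRun');
    var listSubjects = uiApp.createListBox().setId('listSubjects');
    applyButtonStyle(buttonRun);
    applyListStyle(listSubjects);
    uiApp.add(buttonRun).add(listSubjects);
  }

  // In-line CSS is applied with setStyleAttribute, one property at a time.
  function applyButtonStyle(button) {
    button.setStyleAttribute('background-color', '#4D90FE')
          .setStyleAttribute('color', 'white')
          .setStyleAttribute('border-radius', '3px')
          .setStyleAttribute('font-family', 'Arial, sans-serif');
  }

  function applyListStyle(list) {
    list.setStyleAttribute('border', '1px solid #DCDCDC')
        .setStyleAttribute('font-family', 'Arial, sans-serif');
  }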



And that's pretty much it for nice looking controls.


Step 2 - Use client events to improve the user experience

This step takes a bit more planning, but trust me it's worth it.  Using client events extensively is key to making the screen responsive to the user and creating a nice experience.

The first thing is simple, but effective.  On the screenshot shown earlier from my web app, you'll see there are two groups of drop-downs and buttons.  The lower group can be shown and hidden by toggling the label you see just above it (the label saying "Hide").  This is simply done by adding a client click event to that label which sets the visible property of those controls to false or true - a very simple but effective technique for preventing your pages becoming overly cluttered with confusing controls.  Basically it's done like this:
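A sketch of one direction of the toggle (the control names are illustrative; the real page has a matching handler to show the controls again):

  var hideHandler = uiApp.createClientHandler()
      .forEventSource().setText('Show')               // flip the label text
      .forTargets(panelAdvanced).setVisible(false);   // hide the lower group of controls
  labelToggle.addClickHandler(hideHandler);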


You'll see you create an event for the control itself with forEventSource(), and one for any other controls on the screen with forTargets().  Note that multiple forTargets() can be set.

The next bit is really cool.  You can create both client and server events for a control.  This is key if you want to show the user something is processing and prevent them from randomly clicking buttons (remember, events are asynchronous) and stuffing up the beautiful screen you've spent all those late nights developing.  It also enables you to create some Ajax-like functionality if you want.  So for the buttons on the website, here's how I achieved it.

I create a client event for each button where it sets its own text to "Working..." and also sets the background colour of the other buttons to grey.  This all gets reset to the original colours and text once the server-side event is complete.  The code is shown as follows.  First set the client events on the buttons.

  buttonRun.addClickHandler(uiApp.createClientHandler().forEventSource().setText("Working...").forTargets(buttonRunTimeline).setStyleAttribute("background-color","#C0C0C0"));

This then gets repeated for the remaining buttons, referencing the others.  For example, buttonRunTimeline would be:

buttonRunTimeline.addClickHandler(uiApp.createClientHandler().forEventSource().setText("Working...").forTargets(buttonRun).setStyleAttribute("background-color","#C0C0C0"));

Then attach a server event to each button and make sure that, after the server event runs, a function is called to reset the buttons to their original state.
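A sketch of the pattern, assuming the buttons have been given IDs and a server handler named 'runAnalysis' (both names are illustrative):

  buttonRun.addClickHandler(uiApp.createServerHandler('runAnalysis'));

  function runAnalysis(e) {
    var uiApp = UiApp.getActiveApplication();
    // ... do the real server-side work here (run queries, build charts, etc.) ...
    resetButtons(uiApp);              // put the buttons back to their original state
    return uiApp;
  }

  function resetButtons(uiApp) {
    uiApp.getElementById('buttonRun').setText('Get Response')
         .setStyleAttribute('background-color', '#4D90FE');
    uiApp.getElementById('buttonRunTimeline').setText('Run Timeline')
         .setStyleAttribute('background-color', '#4D90FE');
  }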



Step 3 - Other tips

Some other pointers I've found helpful when it comes to reducing the server-side processing time for events.

  • Where you need to do any complex data processing, use arrays - it makes a big difference compared with iterating through rows from a ScriptDb table.
  • Where you want to change controls on a screen, load the controls from doGet() and then toggle their visibility for improved performance - much better than adding and removing controls dynamically.
  • Consider tricks like using some hard-coded initiation values for your screen when users first arrive.  The improved load-time performance can make all the difference to first time users (obviously there's a trade off with dynamically generated values).

In Summary

In summary, I'm not going to pontificate about how to make the world's most awesome UI - there's plenty of books around for that.  But for using GAS in Google Sites I've found some of these tricks have really worked.  Also, if you have time to visit www.socialsamplr.com and have any feedback I'd really appreciate it.

Next Week

Next week I'll run you through how I've created the dynamic charting on the site.  Until then, don't work too hard!

cheers

Daniel


Monday 10 September 2012

Authenticating with Facebook and Google Apps

OAuth2 Authentication for Facebook with Google Apps Script

Hi, here is a quick run-down on how I implemented OAuth 2.0 authentication via Facebook on my Google site using Apps Script.  In the end it was all fairly straightforward, and if anyone wants to see it in "action" you can visit my site at www.socialsamplr.com (shameless plug there).  So here goes.

Step 1 - Create your Facebook app

To do this just follow the basic guidelines at developers.facebook.com.  Note: to enable your Facebook authentication to work for anyone using your site, you will need to provide links to pages covering your privacy policy, contact details etc.  For developer testing you can run in "sandbox" mode, so you won't need to provide these.

Once done your Facebook app will look something like this.


This will then provide you with the Client ID and Client Secret you need for authentication.  Note the URLs you provide will need to match those for your site.

Step 2 - Create the authentication link in your site

To do this you need to create a Google Apps Script widget to embed in your site.  The simplest way is to create an anchor link containing your client ID and a unique session string which you will need to create.  This is to prevent other sites from attempting to authenticate back to your site using your session string.  In my case I create an encrypted string with some random characters appended to a date value set to the system time (and expire the string after 2 hours).  This is then stored in ScriptDb so that when Facebook redirects back, I can check it and use it to create the account.  The code is as follows:
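A sketch of the widget (the client ID, domain and page names are placeholders; the scope shown is just the basic email permission):

  function doGet() {
    var uiApp = UiApp.createApplication();
    var state = Authentication.createFacebookSession();   // unique, expiring session string stored in ScriptDb
    var url = 'https://www.facebook.com/dialog/oauth?client_id=YOUR_CLIENT_ID' +
              '&redirect_uri=https://sites.google.com/a/yourdomain.com/yoursite/fbmonitoring' +
              '&state=' + state + '&scope=email';
    uiApp.add(uiApp.createAnchor('Log in with Facebook', url));
    return uiApp;
  }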


The underlying Authentication.createFacebookSession() code is as follows.  Note this is the string that will be checked when Facebook redirects back after authentication.
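A rough sketch of that function - the real version encrypts the value; here it's just a random token with a creation time saved to ScriptDb so it can be checked (and expired) later:

  function createFacebookSession() {
    var db = ScriptDb.getMyDb();
    var state = Math.random().toString(36).slice(2) + new Date().getTime();
    db.save({type: 'facebookSession', state: state, created: new Date().getTime()});
    return state;
  }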

Step 3 - Create the script for the redirection page on your site

Now that you've presented the user with the link to authenticate via Facebook, you need to create a widget for the page you passed to Facebook as your redirection page (see the link created in the anchor in step 2).  Note this gadget will need to accept the parameters passed in through the site page.  This is done by providing the arguments in the doGet() function for the page.  The first thing to process is the state parameter (the same one you created in step 2) so that you know the redirect is a valid one from your site.  Once you have that, you just cycle through the parameters to get the necessary access token from Facebook (if the user has approved the request) and then the user details.  Code is as follows.

Getting the parameters passed back in through the page.
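A sketch of the redirect page's doGet, following the standard Facebook OAuth 2 flow (the helper names, client ID/secret placeholders and welcome message are illustrative):

  function doGet(e) {
    var uiApp = UiApp.createApplication();
    var state = e.parameter.state;
    var code = e.parameter.code;
    if (!isValidFacebookSession(state)) {          // check the state created in step 2
      uiApp.add(uiApp.createLabel('Invalid session - please try logging in again.'));
      return uiApp;
    }
    // Exchange the code for an access token (the response comes back as a query string).
    var tokenUrl = 'https://graph.facebook.com/oauth/access_token?client_id=YOUR_CLIENT_ID' +
                   '&redirect_uri=YOUR_REDIRECT_URI&client_secret=YOUR_CLIENT_SECRET&code=' + code;
    var tokenResponse = UrlFetchApp.fetch(tokenUrl).getContentText();
    var accessToken = tokenResponse.split('&')[0].split('=')[1];
    // Then fetch the user's details with the token.
    var facebookUserDetails = JSON.parse(
        UrlFetchApp.fetch('https://graph.facebook.com/me?access_token=' + accessToken).getContentText());
    var facebookUserName = getFacebookUserName(facebookUserDetails);
    uiApp.add(uiApp.createLabel('Welcome ' + facebookUserName));
    return uiApp;
  }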

Amendment: sorry, the original code was edited slightly to make it more readable and I introduced an error - doh!  The following line:

var facebookUserName=getFacebookUserName(facebookUser);

should read:


var facebookUserName=getFacebookUserName(facebookUserDetails);



And the functions being called are as follows.
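Illustrative versions of those helpers (the Graph API /me response includes fields such as name and id; the session check mirrors the 2 hour expiry mentioned in step 2):

  function getFacebookUserName(facebookUserDetails) {
    return facebookUserDetails.name;
  }

  function isValidFacebookSession(state) {
    // Look the state string up in ScriptDb and make sure it hasn't expired.
    var db = ScriptDb.getMyDb();
    var results = db.query({type: 'facebookSession', state: state});
    if (!results.hasNext()) {
      return false;
    }
    var session = results.next();
    return (new Date().getTime() - session.created) < 2 * 60 * 60 * 1000;
  }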

And that's pretty much it - enough detail to get OAuth 2.0 for Facebook working on your Google site.  One thing to note is that the pages you use this on will need to be under the https://sites.google.com domain (not a custom domain).  For my site I've put most of the pages under the custom domain and these specific pages under the sites.google.com domain, and the feedback from most people so far is pretty positive.  There is an outstanding issue around custom domains for websites (Issue 546) to track if anything changes here.

Any other questions, just contact me on Google +.

Cheers.

Daniel

Footnote

Just a quick note - apparently Issue 546 is now fixed, which is great news.  This means the authentication should be able to take place on the custom domain rather than sites.google.com.  I'll test this week and confirm it works.