Queuing Tweets with Amazon’s (AWS) SQS

Like many services, we wanted to enable social sharing on Twitter (and other platforms like Facebook) at the launch of PersistentFan.

Initially, our interface with Twitter worked reliably and allowed people to tweet about interesting videos. A basic flow diagram looked like this:

However, once we opened up PersistentFan and traffic increased (it wasn’t a tidal wave), we noticed that a lot of tweets weren’t showing up on Twitter.

Our first attempt was to simply add a basic retry mechanism, which improved things, but only slightly. Our second attempt was to retry multiple times. The flow diagram turned into something like this:

Unfortunately, this led to site performance issues several times, as server threads were busy trying to deliver a tweet instead of handling an inbound HTTP request.
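
To make the problem concrete, here is a minimal sketch of retrying inside the request thread. It is not our actual code: the class name, the Predicate standing in for the Twitter call, the retry count and the pause are all assumptions chosen for illustration.

```java
import java.util.function.Predicate;

/** Sketch: retrying tweet delivery on the thread that is serving the HTTP request. */
class InlineRetrySketch {

    /**
     * Tries to deliver the tweet up to maxAttempts times, pausing between tries.
     * Returns the attempt number that succeeded, or -1 if delivery never succeeded.
     * The calling (request) thread is blocked for the whole duration.
     */
    static int deliverWithRetry(String tweet, Predicate<String> sender,
                                int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (sender.test(tweet)) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis); // the request thread can do nothing else here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical flaky sender: fails twice, then succeeds.
        Predicate<String> flaky = t -> ++calls[0] >= 3;
        long start = System.nanoTime();
        int attempt = deliverWithRetry("Check out this video!", flaky, 5, 200);
        long blockedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Delivered on attempt " + attempt
                + "; request thread was blocked for about " + blockedMs + " ms");
    }
}
```

Every pause and every failed attempt is time the server thread spends on Twitter instead of on the next inbound request, which is roughly what we were seeing.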

I had solved a similar problem (at a much larger scale) at MessageCast by adding message queuing. Implementing a full-blown system with ActiveMQ, etc. seemed like overkill, so I took a look at AWS SQS. It was really easy (and quick) to build a basic prototype. After looking at the typica library, I ended up choosing the AWS library and was off and running.

Tweet delivery logic has now changed to:
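
A minimal sketch of that queued flow, under some loud assumptions: an in-process BlockingQueue stands in for SQS, a Predicate stands in for the Twitter call, and the class and method names are made up for illustration. The point is only the shape: the request thread enqueues and returns, and a separate worker does the delivering and retrying.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Predicate;

/** Sketch: request threads enqueue tweets; a worker delivers them later. */
class QueuedTweetSketch {
    // Assumption: an in-memory queue standing in for an SQS queue.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> delivered = new CopyOnWriteArrayList<>();
    private final Predicate<String> sender; // hypothetical stand-in for the Twitter call
    private final int maxAttempts;

    QueuedTweetSketch(Predicate<String> sender, int maxAttempts) {
        this.sender = sender;
        this.maxAttempts = maxAttempts;
    }

    /** Called from the request thread: just enqueue and return. */
    void enqueue(String tweet) {
        queue.add(tweet);
    }

    /**
     * Called from a worker thread: deliver everything currently queued,
     * retrying each tweet up to maxAttempts times. In this sketch a tweet
     * that fails every attempt is simply dropped.
     */
    void drain() {
        String tweet;
        while ((tweet = queue.poll()) != null) {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                if (sender.test(tweet)) {
                    delivered.add(tweet);
                    break;
                }
            }
        }
    }

    int pending() {
        return queue.size();
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) throws InterruptedException {
        QueuedTweetSketch sketch = new QueuedTweetSketch(t -> true, 3);
        sketch.enqueue("Check out this video!");   // request thread: returns right away
        Thread worker = new Thread(sketch::drain); // delivery happens off the request path
        worker.start();
        worker.join();
        System.out.println("Delivered: " + sketch.delivered());
    }
}
```

With a real SQS queue the worker would typically run in its own process and delete each message only after a successful delivery, so slow or failing calls to Twitter never touch the threads serving HTTP requests.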

 

Overall, things were up and running pretty quickly and we’ve had very few issues. Currently, the AWS console doesn’t allow you to peek into a given queue and manage it (e.g. modify, delete, etc.), but this is supposed to be added in the near future.

Here’s some basic code for sending a message to a queue and getting one back:

Sending a message to the queue:

// queueUrl is a String holding the URL of your queue
SendMessageRequest request = new SendMessageRequest();
request.setQueueUrl(queueUrl);
request.setMessageBody("This is my message text.");
// service is your SQS client
invokeSendMessage(service, request);

Getting a message from the queue:

ReceiveMessageRequest request = new ReceiveMessageRequest();
request.setQueueUrl(queueUrl); // same queue URL as above
invokeReceiveMessage(service, request);

Pricing is quite reasonable as well: 10k requests cost $0.01.

If you’re looking for a queuing solution, I’d recommend taking SQS for a spin.

Iterating in the Open

Over the years I’ve created, advised and invested in online services. Launch strategies have ranged from “let’s release as soon as we have these ‘n’ features” to “let’s release when we have feature parity with competitor ‘x’” to “hell, let’s release what we have and iterate as we go”.

There are pros and cons to each of the above three strategies. Based on my experience, the first and third ideas are workable, but waiting for feature parity is equal to never actually shipping anything (I’ve seen this several times).

When Mike and I came up with the idea for what is now PersistentFan (starting as an FB app, “top3Clicks”) we were determined to employ strategy number three, iterating in the open. The vision behind PersistentFan was to create a fan-oriented site where we could create a system that would programmatically acquire content about our various niche (or even micro-niche … is that an actual term??) areas of interest, notify us and enable sharing with our friends. We started off with something easy – video. Obviously, there are other types of content for a given area (news, blogs, photos, audio, etc) but YouTube had great APIs to get the ball rolling.

We aimed to iterate several times a week, pushing new features and bug fixes (here’s a sample). As the weeks flew by, the functionality of the site would increase bit by bit until we had a full-featured offering. (Note that we did start off in bare-bones, early alpha, invite-only mode.) Feedback from friends (initially) and, as the site grew, from external users would help guide both the features and the priority of the features we shipped. Not crowdsourcing as described in Don Tapscott’s “Wikinomics” but instead open iteration, warts and all.

It’s been several months since we had the first user take the site for a spin and so far, here’s what we’ve found:

The good parts:

  • The ability to evolve service based on reality (i.e. user feedback, actual utilization)
  • Feedback based on usage instead of looking at a PowerPoint slide
  • Users pushing up the priority of a feature (we’ve had the request for a “Forgot My Password” several times now *cough*)

The not so good parts:

  • A new visitor to the site may not see value on initial visit and never return (the “is that all there is?” problem)
  • Bugs, bugs, bugs. We have a staging environment and plenty of automated tests, but when you’re running fast …

Stuff we’ve learned along the way:

  • Iterating in the open is a net positive, but users need information describing updates and bug fixes (blog posts are great for this)
  • Don’t forget to mail registered users about updates (*cough*)
  • Enable the site with an open feedback mechanism (e.g. uservoice)
  • Have a strong grasp on analytics to see actual utilization (e.g. number of signups)
  • Have internal metrics as well as external (use Google Analytics) to get a good picture of what users are doing and the conversations about your site.
  • Reaffirmed that cloud computing is a fantastic way to scale super cheaply (AWS/EC2)
  • Automate as much as you can including unit tests, build and deployment scripts, etc. The time savings and reduction in errors pay off quickly.

Become a PersistentFan if you haven’t already, let us know what works/what doesn’t and look for a steady stream of changes. (Any questions why I’m the Mayor at the local Starbucks?)

Hooked on FourSquare

Every morning, I wait with bated breath. Pushing the “Check-in here” button on my iPhone, I peer at the screen, waiting for the thing that will instantly make my day.

It happened a few times, but somehow, I lost it. I can’t explain why, but I want it back.

I want to be the Mayor of my local Starbucks again. “AJ F.” stole the title from me last week and I’m left wondering if I will ever get it back. My wife thinks I’m nuts. Bacon officially labeled me a “junkie”.

It isn’t the super cool geo-location (which apparently Gowalla does better??), nor the info my social graph provides. The bugs/UI might drive some folks nuts (e.g. I go to the same place every morning, but it never shows up in my list. I have to type out “S-t-a-r-b-u-c-k-s” every time, wait for the service to respond with a list of Starbucks and scroll through to find my local place).

I can’t explain it … I guess this means I’m a Foursquare fanboi now. And I’m ok with that.

ApacheCon 2009

This week I attended ApacheCon 2009, which coincided with the 10th anniversary of the Apache Software Foundation (ASF).

Deciding which sessions to attend was tough (so many sessions … so little time). I focused mostly on the following:

  • Tomcat
  • Internet Scalable Architectures
  • Lucene
  • Solr
  • Hadoop/Mahout

Unlike the Cloud Computing conference I attended earlier in the week, there were no vendor presentations pushing their solutions. Instead, sessions were generally led by project committers. This meant that the speaker was very knowledgeable about his/her subject and was able to answer in-depth questions off the cuff. Very Gnomedex-like!

Slides of each session are supposed to be up on the ApacheCon website “soon”. In the meantime, I’ve included links to some of the sessions. Unfortunately, some were hand drawn, so no slides :-(

I’m not sure on the date for the next ApacheCon but I would highly recommend attending if you are interested in ASF-related projects.

Cloud Computing Attracts Big Players

The Cloud Computing Conference in Santa Clara, CA (11/2/2009 – 11/4/2009) was well attended and featured a number of companies.

There were a number of large players offering cloud solutions. They are not just dipping a toe in the water; many enterprise players are putting significant effort behind their cloud offerings.

Some of the large players included:

  • Intel
  • Oracle (yes, even though Larry said cloud computing was vapor)
  • EMC (focusing on storage, disaster recovery)
  • Unisys
  • Yahoo
  • Microsoft
  • SAP
  • Sun
  • VMware

There were a number of cloud vendors, ranging from large to small startups:

  • Rackspace
  • RightScale
  • 3Tera

Amazon did not make an official appearance, although Jeff Barr tweeted that he would be in attendance.

Major takeaways:

  • Vendors smell large opportunity. Agatha Poon of the Yankee Group thought large scale adoption was a number of years out, with adoption varying by sector. Interestingly, she stated that the healthcare sector was more optimistic on cloud adoption, ahead of both manufacturing and finance.
  • Still working on “closing the deal” – many discussions about overcoming “myths” and having to educate potential customers about public/private clouds, security, etc. The cloud has not entered anything close to the mainstream yet.
  • Major selling points revolve around economics (perhaps a sign of the times): pay-as-you-go, no depreciation, etc.
  • Cloud portability and service interoperability will not happen in the near term.

Agatha Poon captured the state of the cloud at the end of her presentation quite well:

“Make no mistake, cloud services are still evolving”

Nine Myths of Cloud Computing

Richard Marcello, SVP at Unisys, gave the keynote presentation at the Cloud Computing Expo. His talk was entitled “The Time is Right for Enterprise Cloud Computing”.

The most interesting part of the presentation was a list of the “Nine Myths of Cloud Computing”:

  • Myth #9 - Cloud computing is not new, not revolutionary
  • Myth #8 - All clouds are the same
    The speaker noted that there are many different types of clouds including: public, private and hybrid. He felt that in the long run, most companies would end up with hybrid solutions, running what is appropriate for each type of cloud.
  • Myth #7 - Cloud computing is about technology
    Marcello walked through a slide of Forrester info on traditional data center vs a cloud-based solution. He focused on expense, financial risk and depreciation, noting that cloud computing is also about cost.
  • Myth #6 - Private clouds have no benefit over virtualization
    The speaker felt that private clouds had to deliver self-provisioned capabilities. At Unisys, the average setup time for a developer went from 10 days to 5 minutes due to the creation of a self-service web-based UI.
  • Myth #5 - Cloud computing is not reliable
    Marcello disagreed with this myth, focusing on having a disaster recovery strategy, data security requirements and data reliability (using ‘m’ of ‘n’ redundancy strategies).
  • Myth #4 - Cloud value is only about cost
    Don’t let the improvement in agility get lost in the message.
  • Myth #3 - Cloud is not for mainstream business applications
    Marcello felt that cloud computing won't take off until this myth dies.
  • Myth #2 - Cloud is inappropriate for compliance-regulated industries
    If architected properly, the cloud can address all kinds of compliance issues.
  • Myth #1 - Internal datacenter is more secure than the cloud

Overall, quite an interesting presentation; certainly some hype around the cloud, but a good list nonetheless.

Facebook Architecture and Scaling

Dare Obasanjo has a great (long) post about the Facebook Engineering Roadshow he attended in Seattle on 10/28/2009.

I particularly liked the level of detail Dare provided in his write-up. He discusses the evolution of the Facebook architecture from a sharded-by-school approach to today’s much more demanding requirements.

Dare’s description of how the FB News feed is assembled via their “Multifeed” service is incredible:

Multifeed is a custom distributed system which takes the tens of thousands of updates from friends and picks the 45 that are either most relevant or most recent. Bob updates status to DB and then Scribe pushes this to a Multifeed server which is a cache of the most recent 50 items the user has performed. This caches data for every user in the system. When a user shows up, the user hits an aggregator which takes the list of friends and then each leaf node does some filtering then gets back to the aggregator which then does some filtering and returns story IDs. Then the aggregator fills out the data from memcache given the story ID.
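
To help picture the aggregator/leaf flow in that description, here is a toy, in-memory sketch. It is emphatically not Facebook’s implementation: the class names, the idea of keying leaf caches by author, and the newest-first selection are my own assumptions, and the real relevance filtering, Scribe and memcache steps are left out. The caps (50 cached items per user, 45 stories per feed in Dare’s notes) are plain parameters here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy scatter-gather sketch of an aggregator querying leaf caches. */
class MultifeedSketch {

    /** A story reference: its ID, who posted it, and when. */
    static final class Story {
        final long id;
        final String author;
        final long timestamp;

        Story(long id, String author, long timestamp) {
            this.id = id;
            this.author = author;
            this.timestamp = timestamp;
        }
    }

    /** Leaf node: caches the most recent items per author, up to a cap. */
    static final class Leaf {
        private final int perUserCap;
        private final Map<String, List<Story>> recent = new HashMap<>();

        Leaf(int perUserCap) {
            this.perUserCap = perUserCap;
        }

        /** Assumes stories for a given author arrive in time order. */
        void publish(Story story) {
            List<Story> items = recent.computeIfAbsent(story.author, k -> new ArrayList<>());
            items.add(story);
            if (items.size() > perUserCap) {
                items.remove(0); // drop the oldest cached item
            }
        }

        /** Leaf-side filtering: return only stories written by the given friends. */
        List<Story> storiesFrom(List<String> friends) {
            List<Story> out = new ArrayList<>();
            for (String friend : friends) {
                out.addAll(recent.getOrDefault(friend, List.of()));
            }
            return out;
        }
    }

    /** Aggregator: ask every leaf, merge the results, keep the newest story IDs. */
    static List<Long> feedFor(List<String> friends, List<Leaf> leaves, int limit) {
        List<Story> merged = new ArrayList<>();
        for (Leaf leaf : leaves) {
            merged.addAll(leaf.storiesFrom(friends));
        }
        merged.sort((a, b) -> Long.compare(b.timestamp, a.timestamp)); // newest first
        List<Long> ids = new ArrayList<>();
        for (Story story : merged.subList(0, Math.min(limit, merged.size()))) {
            ids.add(story.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        Leaf leafA = new Leaf(50);
        Leaf leafB = new Leaf(50);
        leafA.publish(new Story(1, "bob", 100));
        leafB.publish(new Story(2, "carol", 200));
        System.out.println(feedFor(List.of("bob", "carol"), List.of(leafA, leafB), 45));
    }
}
```

In the flow Dare describes, the story IDs returned by the aggregator would then be filled out with data from memcache; the sketch stops at the IDs.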

Facebook’s use of memcache is well-known but I wasn’t aware of some of the changes they have introduced including:

  • Ported to 64-bit
  • Migrated protocol to UDP
  • Added multi-threading support

Dare notes that these changes have increased the throughput 5x. Hopefully FB will be contributing these changes back to the memcache project.

The money quote for me was:

Huge Impact with Small Teams – Most features and underlying systems on the site are built by teams of one to three people.

Great news for FB that they are able to remain so entrepreneurial and innovative even as the company experiences incredible growth. Not an easy task and something I really miss on a personal level.

Facebook is a large-scale, transaction-intensive service. Learning about how they are working to keep up with the demands of a growing service is fascinating. It is also a great way to leverage what they’ve learned and apply it to your world.