Google App Engine: Embrace the Constraints

Google made a big announcement Monday night that has the web development community talking. They announced and released a preview version of App Engine, a set of tools that lets you quickly build web applications and deploy them to Google’s infrastructure for instant scalability. I want to talk a bit about why it is important to web entrepreneurs, and why Amazon’s web services division doesn’t need to be worried – yet.

What the Google App Engine Is (In Plain English)

At its core, App Engine is a limited, public interface to the technologies Google uses to power most of its web applications, including search, Gmail, Google Maps, and many others. These technologies include the Google File System and Bigtable, the company’s home-grown software that allows its products to run reliably across thousands of distributed computers for millions of users. In other words, App Engine lets you use Google’s hardware and software to create web applications that can seamlessly handle a handful of users one day and millions the next.

If you’re not familiar with web application development, you might not appreciate how difficult it is to get a program to scale – to perform as well under high load as it does with just a few users. Because web apps are usually developed on a single machine, deploying and running a production version on one server is extremely easy. It’s when you need to start adding more hardware resources – web servers, database servers, load balancers, etc. – that things start getting complicated and costly. It’s doable, for sure – just look at popular sites like Facebook, YouTube, MySpace, and Google itself. But it requires a lot of expertise and a lot of money to do things right. For these reasons, web start-ups wisely hold off on optimizing their applications until the demand is actually there. Twitter, which has faced constant scaling problems over the last year, is a perfect case study. So, to developers, App Engine means they don’t need to worry about these scaling problems – at all.

It’s All About the Constraints

The way Google’s distributed infrastructure and App Engine are designed creates some pretty strict constraints for applications running on them. This automatically rules out a bunch of programs that are simply too complex to run there – mainly ones that require access to the operating system and those that need processes to run independently of users’ interaction with the web app (think scheduled tasks, long-running scripts, etc.). The High Scalability blog has an excellent overview of the technical side of App Engine:

Everything that could have tied you to a machine is tossed. No disk access, no threads, no sockets, no root, no system calls, no nothing but service based access. Services are king because they are easily made scalable by load balancing and other tricks of the trade that are easily turned behind the scenes, without any application awareness or involvement.

What isn’t scalable about AppEngine is the scalability of the complexity of the applications you can build. It’s a simple request response system. I didn’t notice a cron service, for example. Since you can’t write your own services a cron service would give you an opportunity to get a little CPU time of your own to do work. To extend this notion a bit what I would like to see as an event driven state machine service that could drive web services. If email needs to be sent every hour, for example, who will invoke your service every hour so you can get the CPU to send the email? If you have a long running seven step asynchronous event driven algorithm to follow, how will you get the CPU to implement the steps? This may be Google’s intent. Or somewhere in the development cycle we may get more features of this sort. But for now it’s a serious weakness.

On the other hand, these exact same constraints make you develop in such a way that your code can instantly go from running on a single development machine to being served from thousands of servers in Google’s farm of computers. Some developers can’t, or won’t, live with Google’s forced limitations, either out of principle or because their application requires a more complex environment. But for the wide range of potential web apps that do meet the App Engine requirements, it provides a unique opportunity to launch a product without worrying about how to grow the back-end as demand increases. Google does it for you – automatically.
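
To make the "simple request response system" idea a bit more concrete, here is a minimal sketch of a stateless handler written against plain WSGI from the Python standard library. It is not App Engine SDK code: `CounterService`, `make_app`, and `call` are names I made up for illustration, and the in-memory counter only stands in for whatever storage service the platform provides. The point is that the handler keeps nothing between requests, so any number of copies could serve the same traffic.

```python
# A stateless request handler in plain WSGI (standard library only).
# All names here are illustrative assumptions, not part of the App Engine SDK.


class CounterService:
    """Stand-in for an external storage service; in-memory for this sketch."""

    def __init__(self):
        self._counts = {}

    def increment(self, key):
        """Add one to the counter for `key` and return the new value."""
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


def make_app(store):
    """Build a WSGI app that keeps no state of its own between requests."""

    def app(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        # State lives in the service, not in this process.
        hits = store.increment(path)
        body = ("%s has been requested %d time(s)\n" % (path, hits)).encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app


def call(app, path):
    """Invoke a WSGI app once without a network socket; returns (status, text)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response)
    return captured["status"], b"".join(chunks).decode("utf-8")


if __name__ == "__main__":
    shared_store = CounterService()
    server_a = make_app(shared_store)  # two "servers" sharing one service
    server_b = make_app(shared_store)
    print(call(server_a, "/"))
    print(call(server_b, "/"))
```

Because both handlers talk to the same service and hold no state themselves, it doesn't matter which one answers a given request. That is roughly the property the constraints force on you.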

What You Get as a Developer

  1. A Python Application Environment: Currently, you can only develop and deploy apps to the App Engine that are written in the Python programming language. Python is heavily used by Google for its products and internal applications, so this limitation only makes sense. There are going to be a lot of PHP developers turned away by this, but Python is a wonderful and powerful scripting language that can be picked up easily by anyone familiar with Ruby or Perl. It also means you can develop using the excellent Django Web framework – a huge plus. Google also provides a free App Engine software development kit that, when downloaded and installed on your computer, locally simulates your app running on Google’s systems. Then, when you’re ready to deploy your application to App Engine, a single command does all of the work for you. Very slick.
  2. A Secure Application Sandbox: This piece securely isolates your program from all others running on Google’s servers, and enforces the App Engine constraints by providing only limited access to the operating system. This does prevent you from calling command line scripts and other programs, but means that your app is independent of hardware, operating systems, and other typical dependencies. That means it can run as easily on 1,000 web servers as it can on one.
  3. A Reliable Datastore: App Engine provides a simple method for storing structured data, which is based on Google’s Bigtable distributed storage system. It sacrifices many of the features of traditional relational database systems like MySQL, Microsoft SQL Server, and Oracle in exchange for speed, simplicity, and reliability. It is similar in design to Amazon’s SimpleDB product, and shares many of its advantages and disadvantages.
  4. Google Account Integration: App Engine makes it extremely easy to use Google Account logins to authenticate users to your application. You can roll your own method if you like, but this offers a quick and secure way of doing it for both the developer and end users (they won’t need to create yet another account).
  5. URL Fetching and Mail Services: Much as you’d expect, the mail API gives your application a way to send out email messages. The URL Fetch API is simple, but key. It’s how applications running on App Engine can consume Atom/RSS feeds and web services from third parties to extend their functionality.
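
To give a feel for the datastore trade-off in item 3, here is a rough sketch of a non-relational store. This is a toy I wrote for illustration, not the App Engine datastore API: `ToyDatastore` and its `put`, `get`, and `query` methods are hypothetical names, and everything lives in memory. It assumes the general shape of stores like this: entities grouped by kind, lookups by key, simple equality filters, and no joins.

```python
import itertools


class ToyDatastore:
    """In-memory toy illustrating a non-relational entity store (hypothetical)."""

    def __init__(self):
        self._entities = {}              # (kind, id) -> dict of properties
        self._ids = itertools.count(1)   # simple auto-incrementing ids

    def put(self, kind, **properties):
        """Store an entity of the given kind and return its generated id."""
        entity_id = next(self._ids)
        self._entities[(kind, entity_id)] = dict(properties)
        return entity_id

    def get(self, kind, entity_id):
        """Fetch one entity by key, or None if it does not exist."""
        props = self._entities.get((kind, entity_id))
        return None if props is None else dict(props)

    def query(self, kind, **equals):
        """Return (id, properties) pairs of one kind matching equality filters.

        There are deliberately no joins and no cross-kind queries here.
        """
        results = []
        for (entity_kind, entity_id), props in sorted(self._entities.items()):
            if entity_kind != kind:
                continue
            if all(props.get(name) == value for name, value in equals.items()):
                results.append((entity_id, dict(props)))
        return results


if __name__ == "__main__":
    ds = ToyDatastore()
    ds.put("Greeting", author="alice", text="hi")
    # Entities of the same kind need not share the same set of properties.
    ds.put("Greeting", author="bob", text="hello", mood="happy")
    print(ds.query("Greeting", author="alice"))
```

Anything that would be a JOIN in MySQL has to be done in application code, or avoided by storing the data differently. That is the kind of feature you give up.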

Why App Engine Isn’t Amazon Web Services

As I hope I’ve made clear by now, there is a certain sweet spot of web applications that can run within the current incarnation of App Engine. The environment isn’t a general-purpose utility computing platform where you “rent” computing resources based on usage. The Amazon Web Services, on the other hand, are exactly that. Amazon’s services work well together, but each is separate so that you can pick and choose which ones to integrate into your program. If you want a cheap way to store files, Simple Storage Service is for you. If you just need a simple way of storing structured data, you can pick up SimpleDB.

Amazon’s Elastic Compute Cloud (EC2) is the service people want to compare App Engine to, but it is, in fact, completely different. EC2 provides a virtualized infrastructure that lets you build your environment exactly how you want. You upload your own Linux image, configured with the settings and applications you want. This could be a web app, but it could also be anything from transcoding audio or video files to performing scientific data analysis or batch processing tax returns. There are essentially no limits on what you can do, with the one caveat that running databases on EC2 is not recommended since the server instances are transient by nature.

The upside to Amazon’s web services is flexibility – you can pick and choose the ones you want to use. The downside is complexity, particularly with the EC2 service. Sure, you have the ability to add or remove server instances on demand, but it’s your responsibility to monitor this situation and create the mechanism for automating the process. Software such as Scalr can ease this burden dramatically, but I imagine there are a fair number of really small companies that would prefer to never have to even think about the scaling problem.
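
To show what "create the mechanism for automating the process" might involve, here is a minimal sketch of the decision logic only. It makes no calls to EC2 or Scalr, and every name and number in it (`desired_instances`, `scaling_action`, the 25% headroom, the instance limits) is an assumption of mine for illustration. A real setup would still need to collect the load metrics and actually launch or terminate instances.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=1, max_instances=20, headroom=0.25):
    """Estimate how many instances are needed for the observed load.

    `capacity_per_instance` is the requests/second one instance can handle;
    `headroom` adds spare capacity so a small spike does not overload things.
    All defaults are illustrative assumptions.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = requests_per_second * (1.0 + headroom) / capacity_per_instance
    # Round up, then clamp to the configured bounds.
    return max(min_instances, min(max_instances, int(math.ceil(needed))))


def scaling_action(current, desired):
    """Turn a current/desired instance count into a (verb, count) decision."""
    delta = desired - current
    if delta > 0:
        return ("launch", delta)
    if delta < 0:
        return ("terminate", -delta)
    return ("hold", 0)


if __name__ == "__main__":
    running = 2
    target = desired_instances(requests_per_second=400, capacity_per_instance=100)
    print(scaling_action(running, target))
```

Even this toy version raises the questions you end up owning on EC2: how to measure load, how much headroom to keep, and how quickly to react. On App Engine, those decisions are made for you.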

Summing it Up

App Engine isn’t going to be the solution every web developer is looking for, but I believe that’s actually a Good Thing. By focusing on a narrow use case and keeping things simple, Google will end up creating a better experience for developers. For developers requiring something with greater flexibility and control, the Amazon web services and other competitors remain extremely good options.

Also, remember that the App Engine is a preview release. That means Google wanted to get it out the door and get feedback as soon as possible. It also means that they’re willing to add or change features to meet the needs of developers. So, some of the current pain-points might be addressed in the near future.
