How to Choose System Monitoring Tools

There are dozens of monitoring solutions available in the marketplace. Beyond commercial tools, there are many custom, smaller, or in-house tools. For example, I am the primary developer of a tool called Avail, which is developed at Pythian. Since we are primarily a managed services company, this tool needs to run on every imaginable platform in existence. Beyond this, since we specialize in databases and database applications, we need to be able to reliably and flexibly monitor a huge number of database products. When evaluating your own monitoring needs, there are many things to consider.

What is your budget?

Unfortunately, monitoring is often the first item to be cut when dealing with budgeting. Some managers think that monitoring doesn't contribute to the bottom line of the company, but I would argue that they're wrong. Monitoring is a vital part of providing a service that is reliable and can be audited. Monitoring tools can assist in preventing security incidents through patch and compliance monitoring. They also assist in narrowing down service performance issues, providing SLA verification, and ensuring that your current infrastructure is meeting your needs. When 3 seconds of waiting can decrease customer satisfaction by about 16%, it's important to ensure your service is reliable and responsive every time. I actively encourage customers to consider monitoring at development time – it's significantly easier to instrument new configurations and code than to retrofit and find funding when you're about to go live.

There are many commercial solutions available at many price points. There are SaaS solutions, on-premises solutions, or, depending on your infrastructure, integrated solutions. Many tools offer per-CPU or per-server licensing, but an increasing number of SaaS providers are offering different billing metrics, including by the hour, by the check, or by the instance. Beyond commercial offerings, there are also many free solutions available for different infrastructure setups. There are pros and cons to each infrastructure layout and each tool, which I will explore in depth in a future article.

What are you trying to monitor?

Traditionally, servers were owned by an organization and housed in its data center. This typically meant a fixed number of assets with total ownership. As such, beyond the load and health of your applications and services, you frequently had to monitor the health of the physical systems as well. Things like disk state, temperature, configuration, and load were critical to ensuring a high quality of service. Virtualization abstracted away some of these issues, allowing load to be balanced and services to be shifted off troublesome hosts.

Many companies are now choosing cloud solutions for their applications, be it private clouds or public cloud utilities. This changes the nature of monitoring – no longer are you managing fixed assets; you are monitoring a service as you scale with demand. Virtual servers and application instances are spun up and down on demand, and you are no longer concerned with the health of a specific physical machine or even a virtual host. This dynamic nature means you need a monitoring solution that can be deployed automatically and monitor the application and service metrics that you care about. Some cloud providers such as Amazon have built out services such as CloudWatch to monitor the status of your service, but these tools tend to scale in cost with your service and don't provide the fine-grained integration that traditional monitoring tools have enjoyed. Some services like NewRelic have addressed application performance and reliability monitoring, but come with high cost and high integration effort.

What are your monitoring metrics and goals?

Depending on the design and nature of your business, there are many things to consider when it comes to monitoring. There are three high-level metrics to consider when designing a monitoring solution for your systems: severity, risk, and preventability. Severity relates to the impact that an outage would have on your business. Risk relates to how likely an issue is to occur. Preventability relates to whether an issue can be prevented before it causes an outage. As a result, designing an effective monitoring solution can be extremely complex. When evaluating monitoring tools you need to consider what kinds of issues can be captured, and the cost in terms of time and software customization. Some things to consider when evaluating the need for and complexity of a monitoring solution:

  • Is your service online and responsive?
  • Are your SLAs being met in terms of backups, service availability, response times?
  • Are your security needs met in terms of patching, vulnerabilities, and configuration?
  • Are all key aspects of your service or system being monitored?
  • How are your thresholds configured and reviewed?
  • Do you have proactive monitoring in place to catch issues before they cause a failure?
  • Are you monitoring all aspects of your system?
  • Does the tool allow for auditing and historical analysis?

What is the difference between agent and agentless monitoring?

Traditionally, monitoring a system meant installing an agent on that system and configuring the agent to monitor a number of metrics, potentially with custom scripting. A popular alternative is agentless monitoring: a central service (SaaS or otherwise) connects using established protocols such as SSH on Unix or WMI on Windows, and runs queries and system commands against the system. The data is then parsed and managed remotely. The advantage of this approach is that you only need to configure permissions and connectivity to a system. While remote access may be ideal for configuration and deployment, the need to transfer data off the system may violate compliance with regulations such as PCI DSS or HIPAA. Other limitations may be platform dependent – you may not be able to access all application features and information on every OS platform, and a compromise of the central service could result in a large data breach.
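
As a rough illustration, an agentless check can be as small as a script on the central server that reaches out over SSH. The sketch below assumes the net-ssh Ruby gem, key-based access for a hypothetical monitor user, and made-up host names:

require 'net/ssh'

HOSTS = ['db01.example.com', 'db02.example.com']  # hypothetical hosts

HOSTS.each do |host|
  Net::SSH.start(host, 'monitor') do |ssh|
    # Run a simple command remotely; parsing happens centrally, no agent needed.
    load_avg = ssh.exec!('cat /proc/loadavg').split.first.to_f
    status = load_avg > 4.0 ? 'WARNING' : 'OK'
    puts "#{host}: #{status} (1-minute load average #{load_avg})"
  end
end

Note that everything, including any sensitive output, travels back over that remote channel, which is exactly where the compliance concerns above come from.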

What kind of integration do you need with your applications?

You may also want to consider the needs of your organization. Application instrumentation and monitoring may be an essential part of your daily operations. Tools like NewRelic can integrate with your application, along with off-the-shelf software, to provide request- and resource-level monitoring. Tools like Nagios can be customized to integrate with your tools as well through custom scripting. Sometimes it is sufficient to monitor the system and high-level components such as the database and the core services running on the system.
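
To make the Nagios point concrete: a custom check is just a script that prints a status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). A minimal sketch in Ruby, with an illustrative mount point and thresholds:

#!/usr/bin/env ruby
# check_db_disk - warn when the database volume fills up (thresholds are examples)
WARN, CRIT = 80, 90
usage = `df -P /var/lib/mysql`.lines.last.split[4].to_i rescue nil

if usage.nil?
  puts 'UNKNOWN - could not read disk usage'
  exit 3
elsif usage >= CRIT
  puts "CRITICAL - #{usage}% used on /var/lib/mysql"
  exit 2
elsif usage >= WARN
  puts "WARNING - #{usage}% used on /var/lib/mysql"
  exit 1
else
  puts "OK - #{usage}% used on /var/lib/mysql"
  exit 0
end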

RESTful API Best Practices

Over the past few days I have read a number of articles outlining what their authors feel are best practices when designing RESTful APIs. I have put together some of the most common points that I feel directly impact good API design. This post is a work in progress, and will evolve as I dive deeper into the world of good API design.

Nouns, not Verbs

A good API starts with its structure. The consumers of most APIs tend to be developers, so consistency and a nice interface go a long way. The first step in good API design is a good resource naming scheme. Too often people intermingle verbs with their resource names. A good API should be noun-based, with the verbs represented by HTTP methods.

Bad: GET /api/getProducts
Good: GET /api/products
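
In a framework with resource routing this falls out almost for free. A minimal Rails-style sketch (the application name and /api scope are illustrative):

# config/routes.rb
MyApi::Application.routes.draw do
  scope '/api' do
    resources :products   # GET /api/products, GET /api/products/1,
                          # POST /api/products, PUT /api/products/1, ...
  end
end

The URLs stay noun-based, and the HTTP method carries the verb.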

API Versions

API versioning can be a point of contention. There are good and bad uses, and there are also appropriate and inappropriate times to make use of versioning. I feel that a well-designed API should not require a version, and that it is beneficial to maintain a consistent API across versions. That said, there are many cases where versioning is necessary, such as major architectural changes, or paying customers who may not be able to keep their applications in lock step with your own. If you are going to make use of versioning in your API, do so in the URL and not in the header. All versions should be explorable, making headers a non-ideal option. Minor revisions can be expressed using special headers, and can also be expressed by adding new fields while leaving existing fields unchanged.

URL: /api/v1/products/1
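
Continuing the Rails-flavoured sketch from above, the version can live in the path as just another routing scope (names are illustrative):

# config/routes.rb
MyApi::Application.routes.draw do
  scope '/api' do
    namespace :v1 do
      resources :products   # /api/v1/products/1
    end
  end
end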

Plural or Singular

I have no strong feelings on plural versus singular naming; however, the collective wisdom dictates that all resource locators should be plural. Regardless of which format you choose, be consistent throughout the application.

Bad: /api/products/1, /api/item/2
Good: /api/products/1, /api/items/2

Nested Resources

Nested resources should only be created when there is a clear relationship between the two resources. Nesting can be used to package convenience methods such as commonly made queries, and it can be used for filtering or sub-selection. The important concept to note is that nested routing should always refine a resource.

Bad: /api/song/1/album/2
Good: /api/album/2/song/1
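
In routing terms, the nesting should read as "songs that belong to album 2". A Rails-style sketch (resource names are illustrative):

# config/routes.rb
Music::Application.routes.draw do
  resources :albums do
    resources :songs, :only => [:index, :show]   # /albums/2/songs, /albums/2/songs/1
  end
end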

GET/HEAD/POST/PUT/DELETE

There is not much to say here – use the HTTP verbs whenever possible! They make your code easy to read and your API easy to consume. There are some caveats which must be considered, such as proxies which block requests other than GET or POST; for these, adopt a convention of setting a header on a POST request which your software can decode and route appropriately (a sketch of decoding such a header follows the table below).

Action – Method
Retrieve an existing resource – GET /api/products/1
Update an existing resource – PUT /api/products/1
Create a new resource – POST /api/products
Delete an existing resource – DELETE /api/products/1
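
For the proxy caveat mentioned above, one common convention (not a standard) is an X-HTTP-Method-Override header on a POST request. A minimal Rack-style sketch of decoding it:

# method_override.rb - rewrite tunnelled POSTs back into their intended verb
class MethodOverride
  ALLOWED = %w[PUT DELETE PATCH].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    if env['REQUEST_METHOD'] == 'POST'
      override = env['HTTP_X_HTTP_METHOD_OVERRIDE'].to_s.upcase
      env['REQUEST_METHOD'] = override if ALLOWED.include?(override)
    end
    @app.call(env)
  end
end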

Status Codes

Whenever dealing with resources, always make sure you return an appropriate status code that follows standards. Most individuals are used to returning HTTP 200 OK or HTTP 400 BAD REQUEST from their applications, but for not much additional cost you can provide a lot more semantic meaning (a short sketch of returning these follows the list below).

201 CREATED – Whenever a resource has been created. Include a URL to the new resource.
202 ACCEPTED – Let the user know their resource has been accepted for processing.
204 NO CONTENT – Let the user know the request succeeded, but the server has nothing to say (i.e. responding to a DELETE request).
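
A sketch of what this looks like in a Rails-style controller (model and helper names are illustrative):

def create
  product = Product.create!(params[:product])
  render :json => product, :status => :created,
         :location => product_url(product)   # 201 CREATED plus a URL to the new resource
end

def destroy
  Product.find(params[:id]).destroy
  head :no_content                           # 204 NO CONTENT - success, nothing to say
end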

Error Codes

Continuing the discussion about status codes: we also need to tell the user when things didn't go so well. It is particularly important that you format your response consistently and correctly, and provide as much detail as possible. Often, this is the only way that developers will be able to interact with your system and get meaningful information back (a sketch of a consistent error payload follows the list below).

400 BAD REQUEST – Malformed request payload such as invalid XML or JSON.
410 GONE – When a resource endpoint has been deprecated.
415 UNSUPPORTED MEDIA TYPE – When a content type was either incorrectly provided or the server cannot handle that content type.
422 UNPROCESSABLE ENTITY – When the request fails server-side validation.
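
A sketch of keeping error payloads consistent, again Rails-flavoured and with illustrative names:

def update
  product = Product.find(params[:id])
  if product.update_attributes(params[:product])
    render :json => product
  else
    # 422 UNPROCESSABLE ENTITY with a predictable error structure
    render :json => { :errors => product.errors.full_messages },
           :status => :unprocessable_entity
  end
end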

Payloads

Payload type doesn't matter in a real sense. Almost every framework and language will support the most common options: JSON and XML. I personally recommend choosing JSON, as XML is unnecessarily verbose and has well-known parsing pitfalls. If you are going to support multiple types, I recommend going the Rails route and appending the type to the URL, such as .xml or .json. This allows for browser explorability and simple visual inspection of the expected payload types.
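
A sketch of honouring the URL suffix (Rails-style, illustrative names):

def show
  product = Product.find(params[:id])
  respond_to do |format|
    format.json { render :json => product }   # GET /api/products/1.json
    format.xml  { render :xml  => product }   # GET /api/products/1.xml
  end
end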

Security

In this day and age, it is unacceptable not to be using SSL. The overhead of SSL is trivial, and the security it provides is invaluable. Certificates are cheap, and allow for the secure transmission of tokens, authentication credentials, and of course your data.
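
If you happen to be on Rails 3.1 or later, for example, the whole application can be pushed onto HTTPS with a single setting:

# config/environments/production.rb
MyApi::Application.configure do
  config.force_ssl = true   # redirect plain HTTP requests to HTTPS
end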

Night of the Living Dead – Tombstoning Your Code

As code bases grow in size, so does the amount of cruft accumulated from years of refactoring, improvement, or feature abandonment. Dead code can become troublesome to troubleshoot, introduce potential security risks, and confuse the developers who continue to work on the system. In an ideal world we would do a simple grep of the code base to find references to a piece of code, and if none come up, prune it. Unfortunately, in the real world this breaks down pretty quickly, as people can do some pretty tricky things that make code usefulness non-obvious:

  • Dynamic code generation – generating paths or dependency names automatically at run time (getters/setters, importing namespaces)
  • External dependencies – projects external to the code base that may depend on some legacy components

There is an interesting idea from David Schnepper of Box, who talks about placing "tombstones" in the code base. The core idea is that every method in a code base is instrumented with a simple marker, such as Tombstone('2014-10-31'), that is logged and then removed from the code once the code path is determined to be live. Over time, only the tombstones in dead code paths remain, which indicates that it is likely safe to remove that zombie code from the code base.
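
A rough sketch of what such a marker might look like in Ruby (the helper name and log destination are made up for illustration):

require 'logger'

TOMBSTONES = Logger.new('tombstones.log')

def tombstone(date_added)
  # Record the call site; paths that show up in the log are live and get the
  # marker stripped, while paths that stay silent become candidates for removal.
  TOMBSTONES.info("#{date_added} hit at #{caller.first}")
end

def legacy_export
  tombstone('2014-10-31')   # is anything still calling this?
  # ... suspected dead code path ...
end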

While this is a fantastic idea for deprecating dead code, I don’t think that this is ideal for a production system. There are a number of things to consider when deprecating code in this manner:

  • What is the performance impact of adding a tombstone to my application?
  • How do you determine an acceptable tombstone purge date? A month? A year?
  • What risk does this introduce in third party integration or other dependent services?
  • How do you handle cross-project dependencies?

Applied correctly, though, I think that adding tombstones is an interesting idea, and one that could perhaps be automated with solid code instrumentation.

All Aboard Ruby On Rails


Last night I posted the code for the next blog installment, which uses the Ruby on Rails web framework to build the blog application. You can find this demo code in the new GitHub repository at http://github.com/deanpearce/MyBlogRails. And please do look at it, play with it, and tell me what I've done wrong, as usual :). I have spent time over the past few weeks familiarizing myself with Ruby and how its gems, such as Rails, extend the functionality of the language. For those of you who are new to Ruby on Rails, it's important to understand that Rails is a Ruby gem, meaning that it extends the language to make Ruby a friendly language for web development. As per usual, I am assuming that you are caught up on at least the fundamentals of the Ruby programming language, and have followed the installation guide (for Linux, as usual :)) at http://wiki.rubyonrails.org/getting-started/installation/linux. The process is rather straightforward, and even easier on Ubuntu, as most (if not all) of the packages can be installed using Aptitude, so you can get the development process going without needing to wait for builds.

Overall, this iteration of the project was easier, since HTML::Mason and Rails ERB views are very similar in construction. It became an exercise in substituting the correct language code while keeping the existing logic in place. For my own sanity, I added a users controller so accounts can be managed through a web UI once you are authenticated (hint: this will be a lead-up to a potential second round, integrating user accounts and external services such as Gravatar :P). Compensating for the time I spent developing the basic template for Catalyst (a grand total of 3 of the 35 minutes), it works out that, from scratch, development of a blog application took about 20 minutes in Ruby on Rails, which is stunning! But that isn't without a few pitfalls that I will go over later in the article.

What I will be covering:

  • Generating your first Rails Application
  • Framing the structure using scaffolding
  • What comes built into Rails (and why the way I used sessions isn't the brightest)

The Installation Process

This section is pretty short, and overall I have no complaints! Using Aptitude to install on Ubuntu, the process from opening the Wiki to having the default "You're Riding On Rails" page was maybe 10 minutes, significantly faster than Catalyst. And because of the way we will be playing with Ruby on Rails, we won't be doing our own database creation! Well, we will be using a MySQL database, but we will use the rake tool to handle the complex parts of the database creation and generation for us.

Creating the App

Whoa! We have already cut a lot of the time out compared to the Catalyst application. The install process was a few simple apt-gets (on Ubuntu; a few simple makes for everyone else) and we can skip the database creation phase! This is great for us since we are trying to shave minutes off development. In the real world, one would want to plan all of this out beforehand, such as planning your controllers rather than your database, but for our purposes we will do this ad hoc. Switch to your favourite development directory and type:

  • pearce@marvin:~/Projects$ rails -d mysql MyBlogOnRails
  • create …

Great, the application was generated! You might be wondering what the -d flag is. By default RoR (Ruby on Rails) uses an SQLite database; the -d flag tells it to build as if we are going to use a MySQL database. The flag may be -d or -D depending on the source you installed from and the version, so keep an eye out for that. This is very similar to Catalyst and, as we'll discover, pretty much the same structure all the popular web frameworks use. Now that we have an application, let's start creating with it.

Configuring the App

The first thing we need to do is give the app some credentials for the database. Ruby on Rails differentiates between development, test, and production environments. The names are pretty self-explanatory, and for our needs we will be using the generated default of development. Switch to your project directory MyBlogOnRails and edit config/database.yml:

  • pearce@marvin:~/Projects$ cd MyBlogOnRails
  • pearce@marvin:~/Projects/MyBlogOnRails$ vi config/database.yml

development:
  adapter: mysql
  database: myblog_development
  username: root
  password: ****
  host: localhost

This tells RoR that our database, which we will call myblog_development, can be reached at that host with those credentials. There are sections for production and test in the file as well, and you can edit those at your leisure, but for our purposes we will only be using development. That's it for setup – nothing to make or build or chmod. Let's continue on to Grand Central: the implementation of the model, view and controller.

Grand Central: MVC Implementation

If your RoR experience has been good so far, this is where things may switch tracks for some or go express for others. There are some tips and tricks that are necessary for a smooth experience, which can be read in the Ruby on Rails Wiki, such as the naming conventions for controllers and the types to specify when generating them. If you have this knowledge in your head, Rails will turn your development express. If you're like me and don't like reading documentation before you start hacking around (though I always end up reading it, and improving what I do :)) then this might be the thing that derails your project (I'm on a roll with these puns :D). The big holdups for me were the following:

  • When using the generator, name your controllers in the singular (it will automatically handle making things plural as needed), or else you will encounter "dependency missing" message hell.
  • The documentation on what is going on when things go wrong at this point is horrible! Nine out of ten sources (sites, docs, comments) say "oh well, regenerate your app" and that's it!
  • For in-depth projects you will want to either plan your database independently and write your own controllers, or plan your controllers (down to the order in which things will be created).

Once you get over that, it's fairly smooth sailing. My first attempts at it were causing premature manual baldness, and I am really surprised that for such a large and vibrant community these basic problems aren't explained more often (and in ways that are easy to find with a search engine).

Controller

So now let’s generate the controllers that we will need for this project:

  • pearce@marvin:~/Projects/MyBlogOnRails$ ruby script/generate scaffold post title:string body:text
  • pearce@marvin:~/Projects/MyBlogOnRails$ ruby script/generate scaffold user username:string password:string email:string

At first this may look a little confusing: why are we telling our controller what we need to include, and what is this scaffolding? This will be explained after we finish generating everything 🙂

And that's it for generating code! What, I bet you thought there was going to be more. Scaffolding is a cool tool included in Rails to generate most of the code needed to get a full MVC application up and running. It creates the model, the view files and the controllers, all in a nice setup for you to hack away on. For the advanced developer, or for very custom components, it isn't too helpful, but for beginners it's a great way to see a tangible application, right now, with not one line of code! We really are slowly coding ourselves out of jobs, aren't we :P. If you are going to be generating the model on your own, you can add the parameter --skip-migration to the generate script, which will still generate all the code for you, but running db:migrate will not attempt to create the tables that you need.

One important thing to note is that if you want the default action to be the posts controller, you will need to manually edit the routes.rb file in the config directory, adding map.root :controller => "post" right before the end directive in the file, and you will need to delete the default index.html.erb template. This will ensure the root action is the post controller and that requests won't accidentally be served the default index template instead.
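
For reference, a sketch of what that edit to config/routes.rb looks like (Rails 2.x routing syntax, as used throughout this post):

ActionController::Routing::Routes.draw do |map|
  # ... routes added by the scaffold generators ...

  # Added manually, right before the end directive:
  map.root :controller => "post"
end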

The next big step will be to actually create the database and tables for our application, and now that our model is defined, we can use the rake tool included in rails:

  • pearce@marvin:~/Projects/MyBlogOnRails$ rake db:create
  • pearce@marvin:~/Projects/MyBlogOnRails$ rake db:migrate

And if we've given the correct permissions to our database user, we should get a message saying the database has been created, plus a message for each table it creates. As of now, we're done! Well, not quite 🙂 but if you want to see your RoR app, you can run:

  • pearce@marvin:~/Projects/MyBlogOnRails$ script/server

This will start a default Mongrel server on port 3000. Navigate to it and you will get the scaffold-generated interface! This is all very exciting: a fully running app that you can technically "post" to with no code at all. But as you can see, there is no security or style either. Let's take it to the next step and look at sessions and creating a controller for logging in:

  • pearce@marvin:~/Projects/MyBlogOnRails$ ruby script/generate controller user login

This creates our new controller along with the login action and views. Notice we are not using scaffolding here, as the implementation will be done in pure manual code and can be found in the GitHub repository under the controller and views of the same name. In principle we want to do the same thing as the Catalyst version: only allow posting when logged in, with a basic login/logout system that just gives us access to the new post page and a logout button.

The big helper here is the built-in session management provided by Rails. This means no fiddling around with authentication like in Catalyst; we can immediately use the session call from within our application like <%=session[:username]%>. This is exactly how it was implemented in the blog application, and while it seems good at first, you soon realize that we depend on the session not being forged and solely on the username :P. In a real system this might not fly as reasonable authentication, but for our purposes it'll be like Fort Knox. Just as easily as we can set a session variable using session[:key] = val, we can use reset_session to erase our session. I won't go into code detail, as I'm assuming you know at least a tiny bit about Ruby, or you have checked out the project and investigated the controller_user.rb file in app/controllers. Once you rework your templates using Ruby syntax such as @post.each do |post| and set up your session management, you will be good to go.
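
For the curious, a rough sketch of the kind of login/logout actions described here – the real code is in the GitHub repository, and the names below are illustrative:

class UserController < ApplicationController
  def login
    if request.post?
      user = User.find_by_username(params[:username])
      # Matching the post's deliberately simple approach: check the password
      # column created by the scaffold and trust the session from then on.
      if user && user.password == params[:password]
        session[:username] = user.username
        redirect_to root_path
      else
        flash[:error] = 'Invalid username or password'
      end
    end
  end

  def logout
    reset_session
    redirect_to root_path
  end
end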

Final Thoughts

Once again, it amazes me that it took far longer to write (and find the time and motivation for :P) the blog post about the application than it did to actually write the application. Ruby on Rails does one thing and does it exceedingly well: it turns ideas into applications quickly with minimal code. Like any framework, though, once you get more complicated than basic single-table layouts and non-scaffold actions, things get a bit more involved. Ruby on Rails sped things up by allowing me to skip figuring out where to place the model and the views, and allowed me to skip some basic templating and routing, since this was all automatically generated.

Pros

  1. If I thought Catalyst was wickedly fast at prototyping, this is like that star being hurled across space at 2.5 million KM/h. Plain and simple, it’s fast.
  2. Routing (chaining and paths in Catalyst) is cake as long as you don't need to make modifications to the default values. Things get a bit more involved when you have to edit the routes.rb file, but the advantage is that in that file you can see all the actions and controllers and exactly what they do. I like that, but I also like to know what the controller is doing in the actual controller file. So there are pluses and minuses to each system.
  3. Default templating is mind-numbingly easy, and since HTML::Mason and ERB files (along with the templating in Cake and JSP pages) are all very similar in structure, with simple language replacement you could migrate templates across systems with almost no work at all. And the scaffolding in Rails means I don't even need to know how to make a default template (though I do), but can just build on what it generates.
  4. Gems are great – I mean, Rails, much like Catalyst, is just a module.

Cons

  1. Documentation for anything before actually writing custom controllers, modules and such is a nightmare. There are great installation guides and how-to walkthroughs online, but anything that strays from the path, or a funny error because you forgot something minor, is hard to find documentation for, in my experience. Also, the default course of action most people suggest is "oh well, regenerate your application", which, when you are only about 3 command-line entries in, isn't bad, but if something goes awry much deeper into the project, regenerating/restarting is unthinkable.
  2. The gems are a great feature, but the downside is that they are harder to find and not always organized into a single repository. Some are packaged and available from Aptitude, some come from individual sites, and there are lots on RubyGems.org, but it just doesn't feel as well organized. Maybe I'm just too new to see the elegance of the system, or I missed Gems 101, but it doesn't feel as solid as Perl in that sense.
  3. Mongrel is nice, but when the default suggestion is to load balance it, I start to question performance a bit 🙂. Much like Catalyst, memcached plugs into Rails easily with Cache.put and Cache.get, so it's probably a good idea to integrate that into your code, and to anticipate using Mongrel Cluster behind Apache (or some other httpd) for maximum performance. The upside is this allows for very easy distribution across servers.

So that's it. I might have some updates to the post in future weeks, or after feedback. As usual, leave a comment below or email me and I will respond. I will not be posting packages in my posts anymore, in favour of uploading them to GitHub or to my local repository. Next on the list is CakePHP, which I look forward to, as PHP is my secret addiction and I want to finish playing with it and put a post together on it. A friend of mine suggested that I add Django, a Python web framework, to my list of things to try, so that has been added to the plate for after CakePHP!