Now that the Big Questions are Settled… Puppet or Chef?

In my longstanding quest for the best tool set for rapid, flexible and powerful development (where for me “powerful” includes robust SQL database support, strong dictionary / NoSQL data support, access to powerful statistics libraries, machine learning libraries, NLP and other libraries, and support for web apps and RESTful back-end services), I have settled on Python3, MySQL / Aurora, PyCharm + Sublime, NLTK, numpy, flask and the rest.  I am already happy with my productivity, and have recently recognized a need to move from OS-X to Linux for more of the heavy lifting.  Which, naturally, means everything is going to AWS…
AWS has improved in a hundred ways in the last three years, and even when I was last certified I thought it was the best thing ever. So on January 31 I hope to be re-certified, but I have already begun to migrate my personal projects and infrastructure to the cloud… I expect this to take months as I interleave it with ongoing development and research projects.

All of which is good, and AWS goes a long way for me toward making infrastructure into software.  But there is another area I want to understand better and apply in my quest for more efficiency, and this raises the question in the title: Chef or Puppet? I won’t throw in Ansible or Salt as this article does, and based on some of what I am reading, perhaps my Python penchant might argue one way, whereas my striving to use workplace-relevant tools and approaches across the board may weight my choice against what is optimal for my current daily workload.

Another factor might be how well AWS integrates with either product, which would weigh more heavily than my personal Python needs, as Ruby and other toolsets are likely to be more important to many of my future clients / employers.

I also should give a big SHOUT OUT (is that big?) to James and his team at LinuxAcademy who continue, five or seven years in, to innovate and do a fantastic job of providing top-flight hands-on training for AWS / Linux / Azure devs, sysops and architects.  Fantastic performance for a small firm that obviously has their priorities right!

But back on Chef vs. Puppet, I will find or create a comparison and figure out if either is going to save me cycles, make me more efficient, or just slow me (and my small team) down!




TensorFlow is released: Google Machine Learning for Everyone

Google posted information about TensorFlow, the open-source release of a key bunch of machine learning tools, on their Google research blog here.

Given the great piles of multi-dimensional tables (or arrays) of data machine learning typically involves, and (at least for us primitive users) the tremendous shovel work involved in massaging and pushing around these giant piles of data files (and sorting out the arcane naming schemes devised to help with this problem is almost a worse problem itself), the appellation of “Tensor Flow” as a tool to help with this is at first blush very promising. That is, rather than just a library of mathematical algorithm implementations, I am expecting something that can help make the machine learning work itself more manageable.

I suspect that just figuring out what this is will cost me a few days… but I have much to learn.



Fault Tolerance in Distributed Systems

Perhaps we are nearly at the point where saying “distributed systems” is as redundant as “software program” always has been, but for the moment I want to consider how a specific issue is heightened by the nature of modern, asynchronous systems, and that issue is “fault tolerance” generally as well as “cascading failures” specifically.

More and more such issues arise, and I was pleased to read a particularly lucid explanation of a popular and important design pattern used in many solutions: the Circuit Breaker pattern.  On Martin Fowler’s blog, haha. I was kind of surprised by that, but only because I don’t google interesting problems in architecture and design nearly as often as I’d like.

I can’t add any value to what he’s written here, so instead I will just quote briefly:

The basic idea behind the circuit breaker is very simple. You wrap a protected function call in a circuit breaker object, which monitors for failures. Once the failures reach a certain threshold, the circuit breaker trips, and all further calls to the circuit breaker return with an error, without the protected call being made at all. Usually you’ll also want some kind of monitor alert if the circuit breaker trips.

There are added bits about adding a capability to attempt automatic reset (at some specified interval) and discussions of other real-world refinements (e.g. different thresholds for different sorts of errors), but a hallmark of this sort of writing is that, at least for most of its intended audience, a simple example provided in detail, and pointers to additional kinds of flourishes and add-ons, is really all that is needed.

Courtesy of Martin Fowler

Check it out!  And if you googled this topic, doubtless you have read or seen something about Netflix’s Hystrix, which says on its GitHub landing page:

Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.

It is a Java implementation; there are other articles linked here and links to alternative Circuit Breaker implementations in Ruby, Java, Grails Plugin, C#, AspectJ, and Scala listed at the bottom of the Fowler blog post.
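
To make the idea concrete for myself, here is a minimal Python sketch of the behaviour described in the quote above: count failures, trip at a threshold, reject calls while open, and allow a trial call after a reset interval. This is my own illustration, not Fowler's code and not how Hystrix is implemented; the names CircuitBreaker, CircuitOpenError, failure_threshold, reset_timeout and clock are all assumptions made up for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open (hypothetical name)."""


class CircuitBreaker:
    """Wraps a protected function and stops calling it after repeated failures."""

    def __init__(self, func, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.func = func
        self.failure_threshold = failure_threshold  # failures before tripping
        self.reset_timeout = reset_timeout          # seconds before a trial call
        self.clock = clock                          # injectable for testing
        self.failure_count = 0
        self.opened_at = None                       # None means "closed"

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"  # reset interval elapsed: allow one trial call
        return "open"

    def call(self, *args, **kwargs):
        if self.state == "open":
            # Fail fast: the protected call is not made at all.
            raise CircuitOpenError("circuit open; protected call skipped")
        try:
            result = self.func(*args, **kwargs)
        except Exception:
            self._record_failure()
            raise
        # A success (including a half-open trial) closes the breaker again.
        self.failure_count = 0
        self.opened_at = None
        return result

    def _record_failure(self):
        self.failure_count += 1
        if self.failure_count >= self.failure_threshold:
            # Trip, or re-trip after a failed trial call, and restart the timer.
            self.opened_at = self.clock()
```

Usage would look something like breaker = CircuitBreaker(fetch_remote) followed by breaker.call(...), where fetch_remote stands for whatever remote call you want to protect. The monitoring alert Fowler mentions would hang naturally off the point where the breaker trips; I have left that out to keep the sketch short.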

Web App Development Frameworks: too many to shake a stick at…

I almost decided to simply add a series of “update” links to my prior post on my quest to understand the current state of “modern web app development”, and in particular, tools, frameworks and environments of choice.  Now as ever, many technologists live in their silos, and while the best and most enlightened attempt to glance across at what others are doing and keep up with new ideas and tool kits, the reality is that most top-tier, hard working, delivering-in-a-crunch developer / architects are almost always too busy to do as much of this sort of thing as they’d like.  And while I am not one of those, I am also a busy person with many balls in the air, ideas kicking around, prototypes-in-process, apps-in-process, with a real job and various research projects…

My prior post focused on the relative popularity of various “platforms” and javascript frameworks, without any real regard for which things work together, which compete, or many other factors.  In short, it is a “dog’s breakfast” of random observations with a good link or two to more coherent material.   Arguably, not very useful.

I will rectify this with future posts about specific combinations of web app development tools and platforms, where there are available tutorials and low barriers to adoption.  And after about five of these, I hope to have come up with a more coherent scheme for comparison.

So How Are Modern Online Services / Communities / Properties / Products Getting Built Now?

If you have deluded yourself that you have a great idea and a solid plan for its realization, and it involves “the web” (and how could it not?), how do you start, and what do you use?

Sometimes it is obvious.  You need AWS. You need a back end. You need a front end… and of course things start getting tricky.   Devices or Browsers?  Do-most-everything tools (Adobe this, Oracle that, Dot-net-whatever, etc.) or best of breed (mix iOS with RoR, or run with Java/Grails and some JavaScript framework on the back end)? Good ol’ LAMP? ColdFusion anyone? Is Haht still around?

At some point you might actually be thinking “which JS Framework is the one to bet on?”

Other than my sympathy, I can also offer this neat URL to tell you so much you didn’t already know about who is using what:    click here to see the Google Trends view of JS frameworks… now.

Is this fair? Is it right? Is it that simple? I stared at this for quite a while before realizing what was missing (I think; this is not an area I know much about!). Node.js anyone? Then change the timeframe, change the regions, etc. Does it really mean anything? I suspect it does… but then, not really so much that is useful in the particular, even if interesting in the general case.

Now back to those other big questions: RoR? Python for everything? Go?  It seems Django / Python or RoR coupled with a JS framework might be good if, say, your UI needed to show markup language effects in a WYSIWYG editor for some reason…

So much to investigate… I think I will just ask the experts!


Update:  Some interesting data on what high growth startups are using:  here  (based on AngelList data, so…).

Cloud-based Design with Amazon Web Services (AWS)

I’ve made an update to my evolving draft document that is intended to convey both fundamental principles of modern solution architecture and a sense of how these principles, once implemented with highly customized and expertly engineered solutions, can now be implemented with foundational “building block” elements that are part of most major “cloud platform” service providers’ offerings. My examples use AWS, or Amazon Web Services, currently both the most advanced and the most used cloud provider.  [An earlier post included comparison information between Google Cloud Platform and Amazon Web Services; see here.]

The latest draft is located here; comments / thoughts are welcome.


Modern Application Architecture

Modern Application Architecture:

Principles for Cloud-based Solution Design

I’ve been working on a small white paper; a monograph of sorts.  But then that overstates it.  I am, I would say, writing a short paper to present and explain the primary elements of “good” solution architecture in the (current) modern era. The scope of this perspective on what is “good” is particularly for mission-critical, production applications supporting real programs or activities that are expected to work reliably and continuously with minimal handholding.  It is informed by the advent of mature but still advancing cloud-computing building blocks, and by an increasing need in every enterprise for solutions where economic scalability is often the primary “reach” for solution architectures that must also serve the lesser gods of “high performance”, “high availability”, “graceful degradation”, “seamless failover” and “automatic recovery / restart”.

At this point I’d say the primary elements are:

  • Memory and Caching
  • Parallelization & Partitioning
  • Data Replication
  • Event-Driven Processing
  • Distributed Processing

Of course, all of these are inter-related.  In fact “partitioning” is at the heart of everything, and intertwined with “parallelization”.  And most of these principles apply at various levels — from the micro to the macro.
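
As a small illustration of how partitioning and parallelization intertwine, here is a rough Python sketch: records are split by a stable hash of their key, each partition is processed independently, and the partial results are merged. The function names (partition_for, partition_records, sum_by_key, parallel_sum_by_key) and the sum-by-key workload are assumptions invented for the example, and the threads merely stand in for what would be separate workers or nodes in a real system.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition_for(key, num_partitions):
    """Map a string key to a partition using a hash that is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_records(records, num_partitions):
    """Group (key, value) records into num_partitions buckets by key."""
    buckets = [[] for _ in range(num_partitions)]
    for key, value in records:
        buckets[partition_for(key, num_partitions)].append((key, value))
    return buckets


def sum_by_key(bucket):
    """Per-partition work: total the values for each key in one bucket."""
    totals = {}
    for key, value in bucket:
        totals[key] = totals.get(key, 0) + value
    return totals


def parallel_sum_by_key(records, num_partitions=4):
    """Partition the records, process each partition concurrently, then merge."""
    buckets = partition_records(records, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(sum_by_key, buckets))
    merged = {}
    for partial in partials:
        # A key always lands in exactly one partition, so partials never overlap.
        merged.update(partial)
    return merged
```

The point of the sketch is the last comment: because the partitioning function sends every occurrence of a key to the same bucket, the workers never need to coordinate, which is what makes the parallel step cheap.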

And I have thought about organizing the work to show how these principles were employed in 1999-2000, at great cost and effort, and how they can be realized using cloud-based (AWS in this instance) building blocks in 2014, at almost no cost (or no “incremental” cost, and in some cases, due to elastic scalability, lower cost).

I am not sure if this notion of “how we did it then” and “how you do it now”, on a design-principle-by-design-principle basis, is going to really illuminate the problem and the brilliance of the current solution set building blocks as brightly as I’d hoped. But so far I think the problem is my writing.

In fact it needs so much work I am not sure I will ever get to it again. So I will link to it here (pt1 and pt2) and hopefully fix it or take it down before too long…  [new version linked in subsequent post]