Thoughts on Multitenant Microservices

I have worked on SaaS and multitenant applications. I have segmented application tenants in the database layer at the row, table, and schema levels, and I have also used separate databases for each tenant. Each strategy had its pros and cons, but they only addressed data segmentation; I still had to deal with logic segmentation for each tenant.

When a tenant wants different or custom functionality, how do I segment the logic in a way that gives that tenant what they want without affecting the other tenants? How do we meter and bill for logic? Complex “if” or “case” statements, reflection, dependency injection…? All a bit messy in my opinion.

Having made the leap to microservices, we now have the option of separate services per tenant. In the UI layer each tenant can have a different UI that encapsulates the UI’s structure, layout, styling, and logic for that tenant. The UI can also have configurable microservices: just a list of endpoints that defines the microservices necessary to drive the UI. During onboarding, or later on an administrative configuration page, tenants can define the functionality they want to use in place of, or alongside, the default functionality by simply selecting from a list of services. We can query the service configuration and monitor service usage to provide customized per-tenant metering and billing.
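
To make this concrete, here is a minimal sketch of what a per-tenant service configuration could look like. Everything here (TenantServiceConfig, resolveEndpoint, the default endpoints) is an invented shape for illustration, not a prescribed format:

// Hypothetical per-tenant service configuration: tenants override
// default endpoints by selecting services from a list.
interface TenantServiceConfig {
  tenantId: string;
  services: Record<string, string>; // service name -> endpoint the tenant selected
}

const defaultServices: Record<string, string> = {
  invoicing: "https://api.example.com/invoicing/v1",
  reporting: "https://api.example.com/reporting/v1",
};

// Fall back to the default service when the tenant hasn't overridden it.
function resolveEndpoint(tenant: TenantServiceConfig, service: string): string {
  return tenant.services[service] ?? defaultServices[service];
}

// A tenant that swapped in a custom invoicing service during onboarding.
const acme: TenantServiceConfig = {
  tenantId: "acme",
  services: { invoicing: "https://api.example.com/invoicing/acme-custom" },
};
resolveEndpoint(acme, "invoicing"); // the tenant's custom endpoint
resolveEndpoint(acme, "reporting"); // the default endpoint

Metering and billing then becomes a query over which non-default services each tenant selected, joined with per-endpoint usage metrics.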

This is not much different from the plug-in strategy you see in content management systems like WordPress and Umbraco, just at a different layer of abstraction. Is this better than the other logic segmentation strategies? I don’t know; I haven’t done it yet.

Am I excited to try it? Hell yeah. Will I fail while trying it? I hope so, because I can learn some new tricks. One thing proper microservices provide is an easier way to reason about an application in bite-sized chunks. Also, with end-to-end automation it is easier to experiment. We can fail often, early, and fast, fix it, and repeat until we get it right. So I think it is going to be fun, in a geeky way, to figure this out, even though thinking about using GraphQL muddies the waters a bit. But that’s another post.

If you have done multitenant microservices or are interested in doing something similar with microservices, let’s talk about it :).

In SQL, NULL is not a value… not a value!

I have been spending a lot of time fixing SQL Server database errors caused by stored procedures attempting to compare NULL. If you don’t know, in SQL:

NULL = NULL is not true (it evaluates to UNKNOWN)

NULL <> NULL is not true either (also UNKNOWN)

NULL is not a value. NULL is nothing. You can’t compare nothing to nothing because there is nothing to compare. I know you can do a select and see the word NULL in the results in SQL Server Management Studio, but that is just a marker so you don’t confuse empty strings with NULL or something.
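
You can see the three-valued logic for yourself with a quick scratch query (a minimal sketch; any SQL Server database will do):

-- Neither comparison evaluates to true, because both evaluate to UNKNOWN.
select case when null = null then 'equal' else 'not provably equal' end;
select case when null <> null then 'not equal' else 'not provably unequal' end;
-- IS NULL is the only reliable test.
select case when null is null then 'is null' else 'is not null' end;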

If you need to do a comparison on a nullable value, please check that shit for NULL first:

t2.column2 is null or t2.column2 = t1.column2

t2.column2 is not null
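
In context, those predicates might look something like this sketch, with made-up table and column names:

-- Hypothetical join where a NULL in t2.column2 means "no constraint".
select t1.id, t1.column2
from table1 t1
join table2 t2 on t2.id = t1.id
where t2.column2 is null or t2.column2 = t1.column2;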

Also, if you try to be smart and turn ANSI_NULLS off, you are going to be hurt when you have to upgrade your SQL Server to a version that forces ANSI_NULLS on (it’s coming).

I have been guilty of comparing NULL and saying, “it has a NULL value.” Now that I am having to fix scripts written by someone who didn’t think about NULL, I wanted to rant and hammer this point home for myself so I don’t cause anyone the pain I am feeling right now. NULL is not a value… not a value!


Where is your logic?

RANT

I hate logic in the database. It’s hard to automate testing for, hard to debug, and hard to get visibility into logic that may be core to the success or failure of an application or business. Some of the worst problems I have had to deal with are database related; actually, almost all of the worst problems have been linked to the database.

I am in love with the new movement toward smaller services doing exactly one small thing very well. I think the database should persist data… period. Yes, there are times when it just makes sense to have logic closer to the data, but I can always think of a reason not to do it, and it always goes back to my experiences with database problems. It’s been a love-hate relationship, me and databases.

I’m not a DBA and I don’t have the reserve brain power to become one. So, to help my limited understanding, I shy away from anything that looks like logic in my data layer. Call it laziness, naïveté, or not wanting to use the right tool for the job; I don’t care. If I’m in charge, get your shitty logic out of the database, and that includes you, evil MERGE statement, current bane of my existence :).

KISS Your Big Data UI

It’s been so hard to blog lately, mostly because I don’t have time to edit my posts. I guess if I wait until I have time to wordsmith better posts I’ll never post, so here is one that has been sitting on the shelf in all of its unedited glory.

What Qualifies A Developer To Talk About UI Design

I’m not qualified; I am not a designer. I haven’t done a lot of posts on the subject of UI design, but with the big push toward big data, real-time streaming analytics, and IoT, I thought I’d share a few things I would consider when designing a UI for them.

I started my tech career designing websites and desktop applications for a few years. Although my customers were happy with my UI designs, it’s not my thing and I don’t think I am good at it. Yet I have been on many application teams and have worked with many awesome designers, and I believe I can speak on UI design considerations from my 16 years of doing this. What I have to say is not gospel; I haven’t searched this stuff out on Bing like I do with engineering problems. Many UI and usability gurus will probably crush me if they read this, but I’m not totally clueless when it comes to UIs.

A Dashboard for Big Data

When designing an administrator’s dashboard for IoT device sensor data, or maybe any big data application, you should probably focus on making the right exceptional conditions highly visible and on showing actionable data trends, with the ability to drill down for more information and take the necessary actions. The most important thing is to alert users to problems, and to anomalies that may indicate pending problems, while providing some facility for investigating and mitigating them. Just as important is being able to identify when something is going well, because you want to learn from the successes and possibly apply that knowledge to other areas.

The UI should assist the user in being proactive about addressing problems. This is true whether you have one device sending sensor data or a fleet of them. Granted, there are differences in design considerations when you start scaling to 100s or 1000s of devices, but depending on the goal of the device, the basic premise is that you want to put the conditions identified by sensors up front and in your face.

KISS Your UI

When you have 100s of messages flowing from sensors, compounded by multiple devices, a pageable grid of a hundred recent sensor messages on the first screen of the UI is useless, unless you are the type that enjoys trying to spot changes while scrolling the Matrix. The dashboard should lead you to take action when problems exist, help you learn from successes, and give you peace of mind that everything is OK. The UI should help you identify potential areas for improvement by uncovering weaknesses. All of this should be done without the noise from the mountain of data being held by the system.

If you could only show one thing on the UI, what would it be? Maybe an alert box showing the number of critical exceptions triggered, with a link to view more information? Start with that one thing and expand on it to provide the user with what they need. A big data UI is not a CRUD or basic application UI. It is more closely related to what one might do for a reporting engine UI, but even that is a stretch. I am sure there are awesome blogs and books out there that speak on this subject, but many of the UIs I have been seeing were designed by people that didn’t get a subscription or something.

Keep It Simple Stupid (KISS) is as much a UI design principle as it is a software engineering one. Stop making people work to understand thousands of data points when it should be the UI designer’s job to simplify them.

Example

Say you have a storage company. You have hundreds of garages, and you want to deploy sensors to each garage and allow your customers to monitor them. The sensors will give you data on the door being open or closed, the temperature, and the relative humidity in the unit. Each device sends a message every 5 minutes, 288 messages a day; multiplied across hundreds of garages, that is tens of thousands of messages a day.

Your customers can opt in to SMS and email alerts on each data point. Some will only want open/close alerts. Some may have items that are sensitive to heat and humidity, and they will want alerts when temperature or humidity crosses some threshold.

Your customers also get access to a website that allows them to modify alerts and view a dashboard where they can investigate and query all of the sensor data. What good would it be to have a grid on the dashboard showing sensor data streaming into the UI every 5 minutes, hundreds of times a day per unit? Why burden them with even having to see the data? Most customers will never have a breach. Temp and humidity sensitive customers will probably have an environment controlled unit that rarely triggers an alert. What does streaming data on the initial dashboard screen give them… nothing.

The only thing most customers want to know is whether someone has opened their unit or whether the temp is fluctuating. The customer wants us to distill all of that data into a simple, digestible UI that addresses their concerns and helps cure any pain they may experience when an alert is triggered.
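
To make “distill all of that data” concrete, here is a minimal sketch of collapsing raw readings into the handful of per-unit numbers a dashboard would actually show. The Reading shape and the thresholds are my own inventions for illustration:

// Hypothetical sensor reading from a storage unit.
interface Reading {
  unitId: string;
  doorOpen: boolean;
  temperatureF: number;
  humidityPct: number;
}

interface UnitSummary {
  doorOpenEvents: number;
  tempAlerts: number;
  humidityAlerts: number;
}

// Collapse thousands of readings into one small summary per unit;
// the dashboard shows the summary, not the raw stream.
function summarize(readings: Reading[], maxTempF = 80, maxHumidityPct = 60): Map<string, UnitSummary> {
  const summaries = new Map<string, UnitSummary>();
  for (const r of readings) {
    const s = summaries.get(r.unitId) ?? { doorOpenEvents: 0, tempAlerts: 0, humidityAlerts: 0 };
    if (r.doorOpen) s.doorOpenEvents++;
    if (r.temperatureF > maxTempF) s.tempAlerts++;
    if (r.humidityPct > maxHumidityPct) s.humidityAlerts++;
    summaries.set(r.unitId, s);
  }
  return summaries;
}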

There may be users that need to dig into the data to investigate, but normal daily usage is focused more on alerts and trends. Seeing all of the data is not the main concern and the data is only available if they click a link to dig into it. The UI is kept clean, not overwhelming, and focused on the needs of the majority of customers.

This is fictional, and if no one is doing it, I probably just gave away a new app idea. This is the reason IoT is hot; it’s wide open for dreamers. The gist of the example is that there is so much data to contend with on these types of projects that you have to hide it and simplify the UI so you aren’t overwhelming the user.

Consumer Apps Are Not the Only Game In Town

Additionally, you have to take into account whether the UI is for consumers or for businesses. Making things pretty can help differentiate an app in the consumer market, but for B2B or enterprise that time is better spent improving usability and the feature set. I am not saying that businesses don’t care about aesthetics, but unless they are reselling your UI to consumers it’s usually not the most important thing. For both audiences usability is very important, but usability doesn’t mean using the latest UI tricks, fancy graphics, or fussing over fonts and colors just for the sake of having them. Everything should be strategically implemented to help usability.

This is especially true for large enterprises. I have seen many very successful and useful apps in the enterprise that were nothing more than a set of simple colored boxes with links to take certain actions and drill down into more information. So you not only have to take into account the amount of data being managed in the UI, but also the user doing the managing. Know your audience and build the UI to their needs. Leave out the gradients and curves and UI tricks until you have a functional UI that serves the core needs of the business, and leave polish for later iterations. Focus first on reducing the mountain of data into bite-sized, actionable chunks.

Conclusion

So, Keep It Simple Stupid! Hide the data, show alerts and trends, allow drill-down into the data for investigation, provide a way to take action, and leave polish for later iterations.

Event Sourcing: Stream Processing with Real-time Snapshots

I began writing this a long time ago after I viewed a talk on Event Sourcing by Greg Young. It was just a simple thought on maintaining the current snapshot of a projection of an event stream as events are generated. I eventually heard a talk called “Turning the database inside-out with Apache Samza” by Martin Kleppmann, http://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/. The talk was awesome, as I mentioned in a previous post. It provided structure, understanding and coherence to the thoughts I had.

It still took a while to finish this post after seeing the talk (because I have too much going on), but I would probably still be stuck on this post for a long time if I hadn’t heard the talk and looked further into stream processing.

Event Sourcing

Event sourcing is the storing of a stream of facts that have occurred in a system. Facts are immutable: once a fact is stored it can’t be changed or removed. Facts are captured as events. An event is a representation of an action that occurred in the past; it is usually an abstraction of some business intent and can carry other properties or related data. To get the state at a point in time, we process all of the events up to that point, building up the state as we go.

State Projections

In our system we want to store the events that happen. We will use these events to figure out the current state of the system. When we need the current state, we calculate a left fold over the stored facts, from the beginning of time to the last fact stored: we iterate over each fact, starting with the first one, calculating the change in state at each step. This produces a projection of the current state of the system. Projections of current state are transient; as soon as new facts are stored, a new projection has to be produced to get the new current state.
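
In code, the left fold is just a reduce. A minimal sketch (the Event shape and state type are placeholders, not a prescribed format):

// Current state is a left fold (reduce) of all stored events.
type Event = { type: string; data: Record<string, unknown> };

function project<S>(events: Event[], apply: (state: S, event: Event) => S, initial: S): S {
  return events.reduce(apply, initial);
}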

Projection Snapshots

Snapshots are sometimes shown in examples of event sourcing. A snapshot is a type of memoization used to optimize rebuilding state. If we have to rebuild state from a large stream of facts, it can be cumbersome and slow, which is a problem when you want your system to be fast and responsive. So we take snapshots of a projection at various points in the stream and begin rebuilding state from a snapshot instead of replaying the entire stream. In short, a snapshot is a cache of a projection of state at some point in time.
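
A snapshot therefore needs to record more than the state itself: where in the stream it was taken and, as the next section shows, which version of the projection produced it. A sketch of the shape (the field names are my own):

// A snapshot caches projected state at a known position in the stream.
interface Snapshot<S> {
  streamPosition: number;    // index of the last event folded into the state
  projectionVersion: number; // which shape of the projection this state has
  state: S;
}

Rebuilding current state then means folding only the events after streamPosition into state, instead of starting from event zero.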

Snapshots in traditional event sourcing examples have a problem: a snapshot is a version of state for some version of a projection, and a projection is a representation of state based on our current understanding. When the understanding changes, there are problems.

Snapshot Issues

Let’s say we have an application and we understand the state as a Contact object containing a name and an address property. Let’s also say we have a couple of facts that alter state. One fact is that a new contact was created, captured by a “Create Contact” event containing “Name” and “Address” data for the contact. Another fact is that a contact changed their address, captured by a “Change Contact Address” event containing the new “Address” data for the contact.

When a new contact is added, a “Create Contact” event is stored. When a contact’s address is changed, a “Change Contact Address” event is stored. To project the current state of a contact that has both events stored, we first create a new Contact object, then get the first event, “Create Contact”, from the event store and update the Contact object from the event data. Then we get the “Change Contact Address” event and update the Contact object with the new address from the event.
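
Here is that walkthrough as code, a minimal sketch with invented event and property names:

interface Contact { name: string; address: string }

type ContactEvent =
  | { type: "CreateContact"; name: string; address: string }
  | { type: "ChangeContactAddress"; address: string };

// Fold one event at a time into the Contact projection.
function apply(state: Contact, event: ContactEvent): Contact {
  switch (event.type) {
    case "CreateContact":
      return { name: event.name, address: event.address };
    case "ChangeContactAddress":
      return { ...state, address: event.address };
  }
}

const stored: ContactEvent[] = [
  { type: "CreateContact", name: "Ada", address: "1 Main St" },
  { type: "ChangeContactAddress", address: "2 Oak Ave" },
];
const current = stored.reduce(apply, { name: "", address: "" });
// current is { name: "Ada", address: "2 Oak Ave" }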

That was a lot of words for a very simple concept: we created a projection of state in the form of a Contact object and changed the state of the projection from stored events. What happens when we change the structure of the projection? Instead of a Contact object with Name and Address, we now have a Contact object with Name, Address1, Address2, City, State, and Zip. We now have a new version of the projection, and previous snapshots made with other versions of the projection are invalid. To get a valid projection of current state in the new shape, we have to recalculate from the beginning of time.

Sometimes we don’t want the current state. What if we want to see state at some other point in time instead of at the head of our event stream? We could optimize rebuilding a new projection by using some clever mapping to transform an old snapshot version into the new version of the projection. If there are many versions, we would have to make sure all supported versions are accounted for.
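
That mapping might be an “upcast” function per supported snapshot version, something like this sketch (the address parsing is deliberately hand-waved; real addresses are messier):

interface ContactV1 { name: string; address: string }
interface ContactV2 { name: string; address1: string; address2: string; city: string; state: string; zip: string }

// Transform a v1 snapshot into the v2 projection shape so we can seed the
// new projection from an old snapshot instead of replaying from time zero.
function upcastV1ToV2(old: ContactV1): ContactV2 {
  const [address1 = "", city = "", state = "", zip = ""] = old.address.split(",").map(p => p.trim());
  return { name: old.name, address1, address2: "", city, state, zip };
}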

CQRS

We could use a CQRS architecture with event sourcing: commands write events to the event store, and queries read state from a projection snapshot. Queries would target a specific version of a projection. The application would be as consistent as the time it takes to produce a new snapshot from the previous one, which is only one event behind (fast).
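
A sketch of that split, reusing the Contact types from above; eventStore and snapshots are invented stand-ins for whatever durable log and key-value store you choose:

// Hypothetical store interfaces; any durable log + key-value store would do.
declare const eventStore: { append(stream: string, e: ContactEvent): void };
declare const snapshots: {
  read(stream: string): Contact;
  update(stream: string, f: (s: Contact) => Contact): void;
};

// Command side: append the fact, then advance the snapshot by one fold.
function changeContactAddress(contactId: string, address: string): void {
  const event: ContactEvent = { type: "ChangeContactAddress", address };
  eventStore.append(contactId, event);
  snapshots.update(contactId, state => apply(state, event));
}

// Query side: never replays events, just reads the current snapshot.
function getContact(contactId: string): Contact {
  return snapshots.read(contactId);
}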

Real-time Snapshots

A snapshot is like a cache of state, and you know how difficult cache invalidation is. If we instead update the snapshot in real time as facts are produced, we always have the current snapshot for a given version of the projection. To maintain backwards compatibility we can keep real-time snapshots for every version of a projection we want to support. When we introduce a new version of a projection, we start rebuilding its state from the beginning of time; once the rebuild has caught up with the head of the stream, we switch it to real-time snapshots. So there will be a period where a new projection version isn’t ready for consumption while it is being built.

With real-time snapshots we don’t have to worry about running funky code to invalidate or rebuild state; we just read the snapshot for the version of the projection that we want. When we no longer want to support a version of a projection, we take the endpoint that points to it offline. When a new version is ready for consumption, we bring a new endpoint online. Upgrading or downgrading is just pointing to the endpoint we want.
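
As a sketch, a real-time snapshot maintainer is just a subscriber that folds each incoming event into every projection version it keeps warm (reusing the Event type from the fold sketch above; the names are mine):

// Keep one live snapshot per supported projection version; each incoming
// event is folded into every version as it arrives, so reads are always current.
interface LiveProjection<S> {
  version: number;
  state: S;
  apply: (state: S, event: Event) => S;
}

function onEvent(projections: LiveProjection<any>[], event: Event): void {
  for (const p of projections) {
    p.state = p.apply(p.state, event);
  }
}

// A new projection version replays the stream from the beginning on the side,
// and joins this loop once it has caught up with the live head.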

Storage may be a concern if we are storing every snapshot of state. We could have a strategy to purge older snapshots. Deleting a snapshot is not a bad thing. We can always rebuild a projection from the event store. As long as we keep the events stored we can always create new projections or rebuild projections.

Conclusion

Well, this was just me trying to clean out a backlog of old posts and finish some thoughts I had on real-time state snapshots from an event stream. If you want to read or see a much better examination of this subject, visit “Turning the database inside-out with Apache Samza” by Martin Kleppmann, http://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/. You can also check out implementations of these concepts with Apache Samza or something like it, such as Azure Stream Analytics.

Recent Talks with Big Effect on My Development

I have seen many, many development talks, actually an embarrassing amount. Many of them have altered how I develop or think about development. And since 2013, a few talks have caused a major shift in how I think about full stack development; my mind has been in a major evolutionary cycle.

CQRS and Event Sourcing

My thought process about full stack development started down a new path when I attended Code on the Beach 2013. I watched a talk by Greg Young that included Event Sourcing, mind ignited.

Greg Young – Polyglot Data – Code on the Beach 2013

He actually followed this up at Code on the Beach 2014 with a talk on CQRS and Event Sourcing.


Reactive Functional Programming

Then I was introduced to React.js, and I experienced a fundamental paradigm shift in how I believed application UIs should be built and began exploring reactive functional programming. I watched a talk by Pete Hunt where he introduced the design decisions behind React, mind blown.

Pete Hunt – React, Rethinking Best Practices – JSConf Asia 2013


Stream Processing

Then my thoughts on how to manage data flow through my applications were significantly altered. I watched a talk by Martin Kleppmann that promoted subscribe/notify models and stream processing, and I learned more about databases than from any talk before it, mind reconfigured.

Martin Kleppmann – Turning the database inside-out with Apache Samza – Strange Loop 2014


Immutable Data Structures

Then my thoughts on immutable state were refined, and I went in search of knowledge on immutable data structures and their uses. I watched a talk by Lee Byron on immutable state, mind rebuilt.

Lee Byron – Immutable Data and React – React.js Conf 2015


Conclusion

I have been doing professional application development since 2000. A lot was burned into my mind in terms of development, but these talks were able to cut through the internal program of my mind to teach this old dog some new tricks. The funny thing is all of these talks are based on old concepts from computer science and our profession that I never had the opportunity to learn.

The point is, keep learning and don’t accept best practices as the only or best truth.

Testing GoCD on Azure Linux VM

I need a GoCD server to test some configuration and API work. So I thought, “Maybe I can get a GoCD server running on Azure.” There are no examples of doing this with GoCD that I could find on the interwebs, but I didn’t do a deep search. Anyway, it can’t be that hard… right?

Create VM

On the Azure portal I set up an Ubuntu VM at the smallest size, A0. I am using the Resource Manager version of the docs to walk me through this, https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-tutorial-portal-rm/.

I configured username/password instead of the more secure SSH keys, because this is a throwaway.

Connect to VM

I install PuTTY on my laptop. Actually, no install; I just download the exe and run it (http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html). PuTTY will allow me to SSH into the VM to manage it.
Then I just connect to the IP of my new VM, enter my username and password, and I have a command window to manage my new VM.

Package Management

I am a Windows groupie, so my first thought was, “I wonder if I can use Chocolatey to install GoCD.” Chocolatey NuGet is a machine package manager, somewhat like apt-get (clue), but built with Windows in mind.
Nope, no Chocolatey on Ubuntu, but the apt-get clue above was right on target (https://help.ubuntu.com/community/AptGet/Howto). I can use apt-get to easily install applications.

Sudo

Before I begin, I want to learn about sudo. I see sudo in a lot of Linux commands so it must be important. Now is as good a time as any to get some context because I have virtually no Linux experience (no VM pun intended).
sudo (/ˈsuːduː/ or /ˈsuːdoʊ/) is a program for Unix-like computer operating systems that allows users to run programs with the security privileges of another user, by default the superuser.
sudo -s
This allows the current session to run as root (the superuser). I’m not sure of the significance of this, or whether it is 100% correct or best practice; all of the resources I read seem to indicate it as a way to elevate the user in the session to avoid certain permission issues while running commands.
So sudo is kind of like a better, more friendly runas command.

Install GoCD Server

I use apt-get to install the server.
echo "deb http://dl.bintray.com/gocd/gocd-deb/ /" > /etc/apt/sources.list.d/gocd.list
wget --quiet -O - "https://bintray.com/user/downloadSubjectPublicKey?username=gocd" | sudo apt-key add -
apt-get update
apt-get install go-server

The first command adds the GoCD Debian package repo to the apt source list. Then a key is added for the new repo (not sure how this is used yet, but something must be signed). Next, apt-get update is run to refresh the package lists from the sources. Lastly, apt-get install is run to install the bits. This workflow is very familiar because it’s like Chocolatey. apt-get handles all dependencies, so this is as easy as can be.
I open PuTTY, connect to the server, and run sudo -s to run as root (I probably should try to install without this step because I don’t know if it is necessary). Then I run the commands to install the GoCD server, and it failed on the first command with “command not found.” I updated the command to use sudo sh -c, to run the command as superuser, and it ran.
sudo sh -c 'echo "deb http://dl.bintray.com/gocd/gocd-deb/ /" > /etc/apt/sources.list.d/gocd.list'
I run the rest of the commands and the magic started happening… finally.
Then the magic fades away on the last command to install go-server: “Unable to locate package go-server.” After poking around I figured out how to open a file so I could inspect the .list file to see if it has the URL from the first command.
sudo vim /etc/apt/sources.list.d/gocd.list
This opens Vim; good thing I have been learning Vim. Anyway, the file has nothing in it. OK, the first command didn’t work for some reason; we’ll fix that later. Right now I decide to manually add the deb repo URL, http://dl.bintray.com/gocd/gocd-deb/ /.
I naively try to ctrl-c copy the URL and paste it into Vim. Of course I made another mistake, because Vim is a different animal (I have to learn how to say sit in German or something). I couldn’t figure out how to paste from the clipboard into Vim (another day), so I just typed the repo URL. As with many manual tasks, I messed up and entered the digit 1 instead of the letter l, and when I ran the command again I got 404 errors (d1.bintray.com instead of dl.bintray.com). After fixing this I was able to apt-get update and apt-get install… Yah!
Now, the GoCD dashboard is accessible at http://MyUbuntuVM.cloudapp.net:8153/go/pipelines.

Install GoCD Agent

Feeling accomplished, I install a GoCD agent:
apt-get install go-agent
Then, I edit the agent config to update the IP to the IP of my VM.
sudo vim /etc/default/go-agent
Then, I start the agent:
/etc/init.d/go-agent start
Lastly, I check the dashboard to make sure the agent exists, http://MyUbuntuVM.cloudapp.net:8153/go/agents, and it does.
Mission Accomplished!

Conclusion

This was a lot easier than I expected, even with my goofs. Creating the VM was so simple I thought I had missed something (nice job, Azure team).
Now, how do I automate it? Can I use a Docker container to make this process even easier and faster? How do I move storage to a more durable, persistent location? How do I install Git (apt-get, maybe)?