I have been spending a lot of time fixing SQL Server database errors caused by stored procedures attempting to compare null. If you don’t know, in SQL:
NULL = NULL is false NULL <> NULL is false
Null is not a value. Null is nothing. You can’t compare nothing to nothing because there is nothing to compare. I know you can do a select and see the word NULL in the results in SQL Management Studio, but that is just a marker so you don’t confuse empty strings with NULL or something.
If you need to do a comparison on a nullable value please check that shit for null first:
t2.column2 is null or t2.column2 = t1.column2 t2.column2 is not null
Also, if you try to be smart and turn ANSI_NULLS off you are going to be hurt when you have to upgrade your SQL Server to a version that forces ANSI_NULLS on (it’s coming).
I have been guilty of comparing NULL and saying, “it has a NULL value.” Now that I am having to fix scripts written by someone who did think about NULL, I wanted to rant and hammer this point home for myself so I don’t cause anyone the pain I am feeling right now. Null is not a value… not a value!
I hate logic in the database. It’s hard to automate testing, hard to debug, hard to have visibility into logic that may be core to the success or failure of an application or business. Some of the worse problems I have had to deal with are database related, actually almost all of the worse problems have been linked to the database.
I am in love with the new movement to smaller services doing exactly one small thing very well. I think the database should persist data… period. Yes, there are times when it just makes sense to have logic closer to the data, but I can always think of a reason not to do it and it always goes back to my experiences with database problems. It’s been a love hate relationship, me and databases.
I’m not a DBA and I don’t have the reserve brain power to become one. So, to help my limited understanding I shy away from anything that looks like logic in my data layer. Call it lazy, naivete, or not wanting to use the right tool for the job, I don’t care. If I’m in charge get you shitty logic out of the database, including you evil MERGE statement and the current bane of my existence :).
It’s been so hard to blog lately. Mostly because I don’t have time to edit my posts. I guess if I wait until I have time to wordsmith better posts I’ll never post, so here is one that has been sitting on the shelf in all of its unedited glory.
What Qualifies A Developer To Talk About UI Design
I’m not qualified, I am not a designer. I haven’t done a lot of posts on the subject of UI design, but with the big push to big data, real-time streaming analytics, and IoT, I thought that I’d put a little thought into things that I would think about when designing a UI for them.
I started my tech career designing websites and desktop applications for a few years. Although my customers were happy with my UI designs, it’s not my thing and I don’t think I am good at it. Yet, I have been on many application teams and have had to work with many awesome designers. I believe that I can speak on UI design considerations from my 16 years of doing this. What I have to say is not gospel. I haven’t searched this stuff out on Bing like I do with engineering problems. Many UI and usability guru’s will probably crush me if they read this, but I’m not totally clueless when it comes to UI’s.
A Dashboard for Big Data
When dealing with designing an administrator’s dashboard for IoT device sensor data or maybe any big data application you should probably focus on making the right exceptional conditions highly visible and showing actionable data trends with the ability to drill down for more information and take necessary actions. The most important thing is to alert users of potential problems and anomalies that may indicate pending problems while providing some facility for taking action to investigate and mitigate problems. Just as important is being able to identify when something is going well because you want to learn from the successes to possibly apply the knowledge to other areas.
The UI should assist the user in being proactive in addressing problems. This is true if you have one device sending sensor data or a fleet of them. Granted there are differences in design consideration when you start scaling to 100s or 1000s of devices, but depending on the goal of the device, the basic premise is you want to make certain conditions identified by sensors up front and in your face.
KISS Your UI
When you have 100s of messages flowing from sensors compounded by multiple devices, a pageable grid of a hundred recent sensor messages on the first screen of the UI is useless, unless you are the type that enjoys trying to spot changes while scrolling the Matrix. The dashboard should lead you to taking action when problems exist, help you learn from successes, and give you peace of mind that everything is OK. The UI should help you identify potential areas to make improvements by uncovering weaknesses. This should be done without all of the noise from the mountain of data being held by the system.
If you could only show one thing on the UI what would it be. Maybe an alert box showing the number of critical exceptions triggered with a link to view more information? Start with that one thing and expand on it to provide the user with what they need. Big data UI is not a CRUD or basic application UI. It more closely related with what one might do for a reporting engine UI, but even that is a stretch. I am sure there are awesome blogs and books out there that speak on this subject, but many of the UIs I have been seeing were designed by people that didn’t get a subscription or something.
Keep It Simple Stupid (KISS) is as much a design principle as it is in software engineering. Stop making people work to understand thousands of data points when it should be the job of the UI designer to simplify it for them.
Say you have a storage company. You have hundreds of garages and you want to deploy sensors to each garage and allow your customers to monitor them. The sensor will give you data on the door being open or closed, temperature, and relative humidity in the unit. The devices send messages every 5 minutes, 7,200 messages a day.
Your customers can opt in to SMS and email alerts on each data point. Some will only want open/close alerts. Some may have items that are sensitive to heat and humidity and they will want alerts when temperature or humidity cross some threshold.
Your customers also get access to a website that allows them to modify alerts and view a dashboard that allows them to investigate and query all of the sensor data. What good would it be to have a grid on the dashboard showing sensor data streaming into the UI every 5 minutes, 7,200 times a day. Why burden them with even having to see the data. Most customers will never have a breach. Temp and humidity sensitive customers will probably have an environment controlled unit that rarely triggers and alert. What does streaming data on the initial dashboard screen give them… nothing.
The only thing most of the customers want to know is has someone opened my unit or if the temp is fluctuating. The customer wants us to simplify all of that data into a simple digestible UI that addresses their concerns and helps them cure any pain they may experience when an alert is triggered.
There may be users that need to dig into the data to investigate, but normal daily usage is focused more on alerts and trends. Seeing all of the data is not the main concern and the data is only available if they click a link to dig into it. The UI is kept clean, not overwhelming, and focused on the needs of the majority of customers.
This is fictional and if no one is doing it, I probably just gave away a new app idea. This is the reason that IoT is hot, its wide open for dreamers. The gist of the example is there is so much data to contend with on these types of projects you have to hide it and simplify the UI so that you aren’t overwhelming the user.
Consumer Apps Are Not the Only Game In Town
Additionally, you have to take into account whether the UI is for consumers or businesses. Making things pretty for consumers can help to differentiate an app in the consumer market, but the time to make things pretty for B2B or enterprise is better served on improving usability and the feature set. I am not saying that businesses don’t care about aesthetics, but unless they are reselling your UI to consumers it’s usually not the most important thing. For both audiences usability is very important, but usability doesn’t mean using the latest UI tricks, fancy graphics, fussing over fonts and colors just for the sake of having them. Everything should be strategically implemented to help usability.
This is especially true for large enterprises. I have seen many very successful and useful apps in the enterprise that were nothing more than a set of simple colored boxes with links to take certain actions and drill down into more information. So, you not only have to take into account the amount of data being managed in the UI, but the user doing the managing. Know your audience and build the UI to their needs. Leave out all the gradients and curves and UI tricks until you have a functional UI that serves the core needs of the business and leave polish for later iterations. Focus first on how to reduce the mountain of data into bite sized actionable chunks.
So, Keep It Simple Stupid! Hide the data, show alerts and trends, allow drill down into the data for investigation, provide a way to take action, leave polish for later iterations.
I began writing this a long time ago after I viewed a talk on Event Sourcing by Greg Young. It was just a simple thought on maintaining the current snapshot of a projection of an event stream as events are generated. I eventually heard a talk called “Turning the database inside-out with Apache Samza” by Martin Kleppmann, http://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/. The talk was awesome, as I mentioned in a previous post. It provided structure, understanding and coherence to the thoughts I had.
It still took a while to finish this post after seeing the talk (because I have too much going on), but I would probably still be stuck on this post for a long time if I hadn’t heard the talk and looked further into stream processing.
Event sourcing is the storing of a stream of facts that have occurred in a system. Facts are immutable. Once a fact is stored it can’t be changed or removed. Facts are captured as events. An event is a representation of an action that occurred in the past. An event is usually an abstraction of some business intent and can have other properties or related data. To get the state for a point in time we have to process all of the previous events to build up the state to the point in time.
In our system we want to store events that happen in the system. We will use these events to figure out the current state of the system. When we need to know the current state of the system we calculate a left fold of the previous facts we have stored from the beginning of time to the last fact stored. We iterate over each fact, starting with the first one, calculating the change in state at each iteration. This produces a projection of the current transient state of the system. Projections of current state are transient because they don’t last long. As new facts are stored new projections have to be produced to get the new current state.
Snapshots are sometimes shown in examples of event sourcing. Snapshots are a type of memoization used to help optimize rebuilding state. If we have to rebuild state from a large stream of facts, it can be cumbersome and slow. This is a problem when you want your system to be fast and responsive. So we take snapshots of a projection taken at various points in the stream so that we can begin rebuilding state from a snapshot instead of having to replay the entire stream. So, a snapshot is a cache of a projection of state at some point in time.
Snapshots in traditional event sourcing examples have a problem because the snapshot is a version of state for some version of a projection. A projection is a representation of state based on current understanding. When the understanding changes there are problems.
Let’s say we have an application and we understand the state as a Contact object containing a name and address property. Let’s also say we have a couple facts that alter state. One fact is a new contact was created and is captured by a “Create Contact” event containing “Name” and “Address” data that is the name and address of a contact. Another fact is a contact changed their address and is captured by a “Change Contact Address” event containing “Address” data with the new address for a contact.
When a new contact is added a “Create Contact” event is stored. When a contact’s address is changed a “Change Contact Address” event is stored. To project the current state of a contact that has a “Create Contact” and “Change Contact Address” event stored, we first create a new Contact object, then get the first event, “Create Contact”, from the event store and update the Contact object from the event data. Then we get the “Change Contact Address’ event and update the Contact object with the new address from the event.
That was a lot of words, but very simple concept. We created a projection of state in the form of a Contact object and changed the state of the projection from stored events. What happens when we change the structure of the projection? Instead of a Contact object with Name and Address, we now have a Contact object with Name, Address1, Address2, City, State, and Zip. We now have a new version of the projection and previous snapshots made with other versions of the projection are invalid. To get a valid projection for the current state with the new projection we have to recalculate from the beginning of time.
Sometimes we don’t want the current state. What if we want to see state at some other point in time instead of the head of our event stream. We could optimized rebuilding a new projection by using some clever mapping to transform an old snapshot version to the new version of the projection. If there are many versions, we would have to make sure all supported versions are accounted for.
We could use a CQRS architecture with event sourcing. Commands write events to the event store and queries read state from a projection snapshot. Queries would target a specific version of a projection. The application would be as consistent as the time it takes to take a new snapshot from the previous snapshot which was only one event earlier (fast).
A snapshot is like a cache of state and you know how difficult it is to invalidate state. If we instead create a real-time snapshot as facts are produced, we always have the current snapshot for a version of the projection. To maintain backwards compatibility we can have real-time snapshots for various versions of projections that we want to maintain. When we have a new version of a projection we start rebuilding state from the beginning of time. When the rebuilding has caught up with current state we start real-time snapshots. So, there will be a period of time where new versions of projections aren’t ready for consumption as they are being built. With real-time snapshots we don’t have to worry about running funky code to invalidate or rebuild state, just read the snapshot for the version of the projection that we want. When we don’t want to support a version of a projection, just take the endpoint that points to it offline. When we have a new version that is ready for consumption we bring a new endpoint online. When we want to upgrade or downgrade we just point to the endpoint we want.
Storage may be a concern if we are storing every snapshot of state. We could have a strategy to purge older snapshots. Deleting a snapshot is not a bad thing. We can always rebuild a projection from the event store. As long as we keep the events stored we can always create new projections or rebuild projections.
Well, this was just me trying to clean out a backlog of old posts and finishing some thoughts I had on real-time state snapshots from an event stream. If you want to read or see a much better examination of this subject visit “Turning the database inside-out with Apache Samza” by Martin Kleppmann, http://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/. You can also check out implementations of the concepts with Apache Samza or something like it with Azure Steam Analytics.
I have seen many, many development talks, actually an embarrassing amount. There have been many talks that have altered how I develop or think about development. Actually, since 2013, some talks have caused a major shift in how I think about full stack development, my mind has been in a major evolutionary cycle.
CQRS and Event Sourcing
My thought process about full stack development started down a new path when I attended Code on the Beach 2013. I watched a talk by Greg Young that included Event Sourcing, mind ignited.
Greg Young – Polyglot Data – Code on the Beach 2013
He actually followed this up at Code on the Beach 2014 with a talk on CQRS and Event Sourcing.
Reactive Functional Programming
Then I was introduced to React.js and I experience a fundamental paradigm shift in how I believed application UIs should be built and began exploring reactive functional programming. I watched a talk by Pete Hunt where he introduced the design decisions behind React, mind blown.
Pete Hunt – React, Rethinking Best Practices – JSConf Asia 2013
Then my thoughts on how to manage data flow through my applications was significantly altered. I watched a talk by Martin Kleppmann that promoted subscribe/notify models and stream processing and I learned more about databases than any talk before it, mind reconfigured.
Martin Kleppmann – Turning the database inside-out with Apache Samza – Strange Loop 2014
Immutable Data Structures
Then my thoughts on immutable state was refined and I went in search of knowledge on immutable data structures and their uses. I watched a talk by Lee Bryon on immutable state, mind rebuilt.
Lee Bryon – Immutable Data in React – React.js Conf 2015
I have been doing professional application development since 2000. There was a lot burned into my mind in terms of development, but these talks where able to cut through the internal program of my mind to teach this old dog some new tricks. The funny thing is all of these talks are based on old concepts from computer science and our profession that I never had the opportunity of learning.
The point is, keep learning and don’t accept best practices as the only or best truth.
I need a GoCD server to test some configuration and API work I need to accomplish. So I thought, “Maybe I can get a GoCD server running on Azure.” There are no examples of doing this with GoCD that I could find on the interwebs, but I didn’t do a deep search. Anyway, it can’t be that hard… right?
On the Azure portal I setup an Ubuntu VM with the lowest size, A0. I am using the Resource Manager version of docs to walk me through this, https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-tutorial-portal-rm/.
Connect to VM
Install GoCD Server
echo "deb http://dl.bintray.com/gocd/gocd-deb/ /" > /etc/apt/sources.list.d/gocd.list
wget --quiet -O - "https://bintray.com/user/downloadSubjectPublicKey?username=gocd" | sudo apt-key add -
apt-get install go-server
sudo sh -c 'echo "deb http://dl.bintray.com/gocd/gocd-deb/ /" > /etc/apt/sources.list.d/gocd.list'
sudo vim /etc/apt/sources.list.d/gocd.list
Install GoCD Agent
apt-get install go-agent
sudo vim /etc/default/go-agent
Graphical Test Plan
I read a little about graphical test planning created by Hardeep Sharma and championed by David Bradley, both from Citrix. It’s a novel idea and sort of similar to the mind map test planning I have played around with. The difference is your not capturing features or various heuristics and test strategies in a mind map, you are mapping expected behavior only. Then you derive a test plan from the graphical understanding of the expected behavior of the system. I don’t know a lot about GTP, so this is a very watered down explanation. I won’t attempt to explain it, but you can read all about it:
Plan Business Driven Development with GTP
I said we use Gherkin, but our new test runner transcends just GWT. We can define PAE in plain English without the GWT constraints, we can select the terms to describe PAE instead of being forced to use GWT which sometimes causes us to jump through hoops to force the GWT wording to sound correct.
Test Case Diagram
- execute automated checks
- open a manual exploratory test tool
- view current test state (pass/fail)
- view historical data (how many times has this step failed, when was the last failure of this scenario…)
- view flake analysis or score
- view delivery pipeline related to an execution
- view team members responsible for plan, develop, test and release
- view related requirement or ticket
- much more…
Since we also define manual tests by just tagging features or scenarios with a manual tag or creating exploratory test based feature files, we could do this for both automated checks and manual tests.