Monday, July 6, 2015

Cyber Security and Business Analytics: Imperfect Together

This week, I was reading an excellent piece here about the cyclical nature of the Business Intelligence/Analytics industry (BI). The assertion here is that priority tends to swing between periods of high business-driven enablement and IT-driven governance. The former tends to be brought on by advances in technology, and the latter by external events, regulation, and necessity. We are currently at the apex of an enablement cycle at the expense of governance. One casualty of lax governance is often cyber security.

Recently, we have seen a rash of high-profile data breaches. One of these was the large-scale theft of data from the health insurer Anthem. This one was notable as it was the result of a vulnerable data warehouse where sensitive data was left unencrypted.

Those of us who practice BI and Data Warehousing professionally have a paradox to deal with. We have always been evaluated on our ability to make more data available to more users on more devices with the least effort to support business decisions. In the process, we tend to create ‘one-stop shopping’ and a slew of potential vulnerabilities for those who would access proprietary data with criminal intent.

The software vendors in our space have been all too complicit in this. After all, what sounds better to the business decision-makers they market to: “multi-factor authentication” or “dashboards across all your mobile devices”? “Advanced animated visualizations” or “intrusion detection”? “Data blending” or “end-to-end data encryption”?

How about “self-service business analytics” or “help yourself to our data”? Consider how easy we make it for the users in an enterprise to export just the useful parts of a customer database, along with summaries of transaction history, to a USB stick and walk out the door with it.

This idea that BI and data warehousing requires more attention to security is starting to gain traction, however. A quick web search reveals that the academics are starting to study it and the leading established vendors in the space are starting to feature it in their marketing in ways I have not seen before. See the current headline on the MicroStrategy website for one example.

The main takeaway here is that BI and data warehousing practitioners need to consider cyber security in architectures and applications the same way it is done in transaction processing:
  • Get a complete BI vulnerability assessment from a cyber-security professional
  • Calculate the expected value of an incident (probability of an event times the cost to recover) and allocate budgets accordingly
  • Demand proven security technology from your vendors and integrators around features such as authentication, end-to-end encryption, and selective access controls by organizational role
  • Don’t be afraid of the cloud. The leading vendors of cloud services employ state-of-the-art security technology out of market necessity and are often the most cost-effective solution available.
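
To make the expected-value bullet concrete, here is a minimal sketch in Python. The scenario names, probabilities, and recovery costs are made-up placeholders, not figures from any real assessment; the only part taken from the list above is the formula itself (probability of an event times the cost to recover).

```python
from dataclasses import dataclass


@dataclass
class IncidentScenario:
    """One hypothetical security incident to budget against."""
    name: str
    annual_probability: float  # assumed chance of the event in a year (0 to 1)
    recovery_cost: float       # assumed total cost to recover, in dollars

    def expected_value(self) -> float:
        # Expected value = probability of the event times the cost to recover.
        return self.annual_probability * self.recovery_cost


def rank_by_expected_value(scenarios):
    """Return scenarios sorted from highest to lowest expected value."""
    return sorted(scenarios, key=lambda s: s.expected_value(), reverse=True)


# Illustrative inputs only; real numbers would come from a vulnerability assessment.
scenarios = [
    IncidentScenario("Stolen mobile dashboard device", 0.25, 50_000),
    IncidentScenario("Unencrypted warehouse breach", 0.02, 5_000_000),
    IncidentScenario("Customer extract on a USB stick", 0.10, 400_000),
]

for scenario in rank_by_expected_value(scenarios):
    print(f"{scenario.name}: ${scenario.expected_value():,.0f} per year")
```

Ranking this way tends to surface the rare but expensive events, which suggests budget should follow expected cost rather than how often an event is likely to occur.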


What’s old is new again - BI edition


Those of us with a long history as business intelligence (BI) practitioners have pretty clear memories of all the days when we saw an overhyped technology promise to change the game by freeing business organizations of IT tyranny with a new class of products that made self-service reporting and analytics better, faster, and cheaper. We saw this with the arrival of relational databases. Believe it or not, they were originally all about data access, not transaction processing. We saw it again when Online Analytical Processing (OLAP) was available on top of Online Transaction Processing (OLTP). OLAP brought data access directly to our spreadsheets and PowerPoints where we really wanted it. In both cases, business organizations bought this technology and built organizations to use it thinking they could declare their independence from IT. It worked splendidly for a while. IT created extract files from their applications and celebrated getting out from under a backlog of reporting requests. Businesses felt empowered and responsive as they created reports, dashboards, and even derivative databases integrating internal and external data within their siloed subject areas.

Then reality set in.

All these new products required maintenance, documentation, training, version control, and general governance. “Shadow IT” organizations sprang up. They often became, in aggregate, far more expensive and just as cumbersome as what they replaced. Worse, the software vendors happily exploited this balkanization of larger organizations by selling redundant technology that had to be rationalized over time, causing licenses to become unused and not transferable. Wouldn’t it be nice to buy a slightly used BI software license at a deep discount?

The fatal flaw in this arrangement is the proliferation of overlapping and inconsistent data presentations that we call multiple versions of the truth. These create mistrust and cause executives to go with their guts in lieu of their data.

Each of these technology advances, along with even faster hardware evolution, did have the impact of making decision support and analytics far more powerful even as the open source movement made it more accessible. This, in turn, created competitive advantage for those who learned to exploit it and made a strong analytics capability mandatory in today’s commercial climate.

One problem still remains. As we like to say, you can buy technology, but you can’t buy your data. Today’s analytics require integrated and governed data across finance, operations and marketing, online and offline, internal and external.

That brings us to the current generation of revolutionary BI tools like the latest data visualization technology that is all the rage right now. (I won’t name names, but think “T” and “Q”.) Just like the previous BI waves, they exploit technology advances very effectively with features like in-memory architectures, wonderful animated graphics for storytelling and dashboards, and even data integration through “blending” and Hadoop access. These products have been hugely successful in the marketplace and are forcing the bigger established players to emulate and/or acquire them. The buyers and advocates are usually not IT organizations, but business units who want to be empowered now.

What does this mean for business decision makers? Just like the technology waves that preceded them, these new visualization tools do not address the organizational and process requirements of a highly functional and sustainable BI capability. Data and tools must be governed and architected together to create effective decision support.  Otherwise, you end up with unsupported applications producing powerful independent presentations of untrustworthy data.

We have seen this movie before and we know how it ends.


Mr. Robinson is currently a Business Intelligence and Analytics consultant with Booz Allen Hamilton. He has previously held practice and consulting leadership positions with Ernst & Young, Oracle, Cox Automotive (AutoTrader.com) and Home Depot.com

Saturday, November 29, 2014

You say you want the real BI requirements?

We often preach how important effective and accurate business requirements are to the success of BI projects. We also lament that gathering these requirements can be difficult. There are any number of reasons for this. They range from analysts claiming “users can’t tell me what they want” to statements from customers like “just give me the data and I’ll know what to do with it.” That story usually does not end well.

Others like to play the Agile card and claim that documented requirements are not necessary and they will get to the right place through prototyping and iterative development. That may happen in some cases, but it becomes hard to predict how long this will take and what the costs might be using this method exclusively.

Let me make a couple of suggestions that might help make your requirements more effective. The first has to do with process.  As business analysts, we are taught to discover, design, and document business processes so that we may find ways to enable them with information technology. But, you may say (and I have) that BI has nothing in common with a traditional business process like order-to-cash, for example. The result of the preceding question determines the next one, right? This lack of predictability makes traditional requirements gathering difficult if one goes about it using conventional methods.

Simply documenting capabilities is another approach, but it is a cop-out. How worthless to a developer is a spreadsheet that says “the System Shall…” about 150 times?

The fact is that we are in the business of supporting decision making and decision making IS a process. Think about the purchase decisions we all make. We gather information from a variety of sources, some of it structured like price lists, supplemented by unstructured information like reviews and often aided by collaboration with friends over social media. This is a defined process that we enable with technology.

Business analysts should never begin requirements gathering by trying to mock up reports or defining the necessary data.  Instead, find out what the decisions are that are to be supported. Discover how they are made now, and then work to improve the process with BI technology.  Then the data acquisition, integration, presentation, and collaboration requirements will fall right out and you will end up with a document that takes a systemic approach to a better decision support capability.


I promised a second tip, so here it is.  When you set up requirements interviews with your customers, ask them to produce the spreadsheets they use now. Even if your customer has trouble telling you what they need, the spreadsheets usually speak volumes. They are the best evidence of how data should be managed and presented, as they are usually used to do what current reporting systems cannot.  Oh, and don’t be satisfied with just replicating what the spreadsheet is doing.  Think of them as prototypes that are limited by the technology that you are there to improve. 

Monday, October 13, 2014

Confusing BI with traditional IT applications development – 25 years later

Apologies for the delay between posts.  The long weekend afforded me the time to get back at it.

When I began my career as a BI practitioner over 25 years ago, I came to it trained in Decision Sciences academically and from a professional background as a reformed financial analyst and IT business analyst / project manager. This made me well qualified to take on early BI projects as, in the early days, most BI work was financial in nature. What I discovered pretty quickly is that traditional IT methods and skill sets did not translate very well to BI projects. In fact, I had to unlearn much of what I had picked up managing transaction processing application development in order to succeed with BI.

Part of this was the direct result of the technology. ROLAP might have used the same DBMS technology as I was accustomed to, but data modeling took on a whole new meaning with de-normalized read-only star schemas. MOLAP? That was another animal altogether. Beyond that, the goal of an enterprise data warehouse required an eye toward not only the current project, but laying the groundwork for inclusion of additional subject areas regardless of how one went about that <insert Inmon vs Kimball religious debate here>.

Far more important, however, was the need to ditch standard waterfall methodology in favor of iterative development cycles that would often start with a proof of concept, perhaps part of a vendor bake-off, followed by a prototype, followed by iterative builds of the application.  The good news was that the user interfaces came out of the box; some even had Excel sitting in front of them, so little development or customization was needed there.  This alone was a departure from what we were accustomed to.  All of the effort needed to be focused on the design and optimization of the underlying data structures, along with the extraction jobs from source applications that fed them.

This in turn created a need for an entirely new kind of business analysis discipline that defined flexible environments to describe data navigation and investigation rather than pre-defined use cases of deterministic user interactions and transactions. Environments are defined by events, attributes and dimensions instead of the business rules that characterize traditional requirements for transaction processing and simple reporting systems. I elaborate on this in my previous post on Agile BI.

Project managers, for their part, had to practice agility long before Agile became fashionable. They learned quickly that their customers would not tolerate lengthy, expensive data warehouse development cycles. They demanded the rapid value the technology vendors promised. This mandated breaking big projects up into smaller deliverables and iterative development that allowed users to experience capability they had never seen before and discover opportunities to create additional value even as they refined their original requirements. Scope creep had to be assumed and managed, along with expectations. Entirely new processes around data governance and end-user computing management needed to be developed.

BI developers came to understand that they needed to develop a certain tolerance for ambiguity in requirements and a need for flexibility in their products, at times even predicting what the next set of requirements might be given their knowledge and experience with the underlying data. This was a huge advantage on any BI project.

QA folks, for their part, also needed to rethink their craft. For one thing, it became necessary to work very closely with the BAs (if the BAs did not assume this responsibility altogether). Assuring data quality is a very different task from testing transactions against a set of business rules. It puts an emphasis on automated comparisons with source data and working with users who have tribal experiential knowledge of the data as part of the user acceptance testing process.
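
As a rough illustration of what an automated comparison with source data can look like, here is a small Python sketch using the standard library's sqlite3 module. The table names (source_orders, fact_orders), the columns, and the sample rows are all hypothetical; the idea is simply to compare row counts and measure totals per key on both sides and report where they disagree.

```python
import sqlite3


def summarize(conn, table, key_column, measure_column):
    """Return {key: (row_count, measure_total)} for one table."""
    # Table and column names are interpolated directly, so they must come
    # from trusted test configuration, never from user input.
    sql = (
        f"SELECT {key_column}, COUNT(*), SUM({measure_column}) "
        f"FROM {table} GROUP BY {key_column}"
    )
    return {key: (count, total) for key, count, total in conn.execute(sql)}


def reconcile(conn, source_table, target_table, key_column, measure_column):
    """Return the keys whose counts or totals differ between the two tables."""
    source = summarize(conn, source_table, key_column, measure_column)
    target = summarize(conn, target_table, key_column, measure_column)
    mismatches = {}
    for key in sorted(source.keys() | target.keys()):
        if source.get(key) != target.get(key):
            mismatches[key] = {"source": source.get(key), "target": target.get(key)}
    return mismatches


# Hypothetical data: amounts are whole cents so totals compare exactly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_orders (region TEXT, amount INTEGER);
    CREATE TABLE fact_orders (region TEXT, amount INTEGER);
    INSERT INTO source_orders VALUES ('East', 100), ('East', 250), ('West', 75);
    INSERT INTO fact_orders VALUES ('East', 100), ('East', 250);
""")

mismatches = reconcile(conn, "source_orders", "fact_orders", "region", "amount")
print(mismatches)
```

A check like this only shows that the totals tie out. It does not replace the users who know from experience when a number that reconciles still looks wrong.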

So why bring all of this up now if it was all learned a generation ago? For one thing I have come to understand that there are still application development shops out there run by folks without direct BI experience that are just starting to take on BI responsibilities from their users. For another, recent technologies such as Hadoop have created a need to rethink aspects of the BI development discipline and the Agile movement has given the false impression that it translates literally to BI projects without modification.


I will comment in my next post on what characteristics to look for when staffing BI project teams. Until then, all comments welcome.

Sunday, June 29, 2014

BI by any other name

I have a bit of an identity crisis with regard to what I do professionally. 

I have spent most of my career working within what we now call “Business Intelligence” or “BI” for short. Before that, this general discipline went by some other names, including Decision Support Systems (DSS) and Executive Information Systems (EIS). Since then the software marketing machinery, with the aid of the industry analyst community, has effectively retired those terms and coined a few new ones. These include Data Warehousing and Data Management for solutions that describe back-end platforms that support the user-oriented front-end solutions like Analytics and Visualization. In general, though, we tend to use BI to describe the process of acquiring, managing, processing and presenting information in direct support of decision making across business processes and organizational levels.

In recent years, I have focused on the Digital Analytics discipline and tools. This term is used to describe the art and science of capturing user interactions across platforms and touchpoints. We then integrate, aggregate, model and present this data to support decisions around things like marketing spend, personalization features, content management, and product experiments.

For me, the transition is relatively seamless as I think of Digital Analytics as simply a specialized form of BI. The basic processes are the same. We acquire the data, manage it, and present it in such a way as to discover patterns, model outcomes, and track results of previous decisions. The skills required are similar as well: You need Business Analysts who can bridge between the technical and business side personnel, developers who make the tools work, QA folks to verify the results and experienced managers who can keep everyone rowing in the same direction.

Since Digital Analytics is still a fairly young discipline, there is a perception out there that it is really something distinct and different from BI. There is some truth to this in the sense that the software tools universe is still pretty bleeding edge and fragmented - new big data/data science startups seem to appear on the scene almost daily - and few true data integration standards have emerged. This works to the advantage of relatively experienced practitioners and consultants who can command premium rates by promoting themselves as Digital Analytics experts.

We have seen this all before, in the early days when BI was first recognized as a discipline and a wave of OLAP and SQL-based reporting tools hit the market in the 1990s. In those days, purchase and hiring decisions were often made outside of IT as Technology executives were slow to accept the importance of BI applications and the notion of end-user computing in general. Eventually, the tools market consolidated into a small number of dominant players and BI specialists became easier to find and identify. IT moved in to regain control of the spend and inject some needed discipline into application development and maintenance.

Currently, we see the investments in Digital Analytics driven mostly by business side organizations and facilitated by vendors who offer their solutions as a service that can be stood up almost entirely without the participation of IT. I think we will see history repeat itself. Business users are realizing that Digital Analytics data has limited value until it can be integrated with data from other business processes like CRM, supply chain and ERP/Financials that IT manages. The best tools will continue to be absorbed into the enterprise software firms, and issues like data governance, privacy and security will need to be addressed by IT professionals who have the necessary experience. Beyond that, the market will force a balance between supply and demand for the specialized big data and predictive modeling skills that are scarce right now. In addition, those who are already skilled at application development, data presentation and interpretation in other domains will adapt their skills to the Digital Analytics space.

As the majority of businesses face their customers and partners using digital touchpoints, the information these interactions produce will become part of mainstream BI and digital analytics will become just another BI data domain. IT will embrace its role in managing that data, the technology will standardize, and the valued expertise will center on the data itself and the processes that add value to that data.

Monday, June 9, 2014

Self-service BI best practices




In my previous post, I discussed some of the drivers and enablers behind the current push within many IT organizations to implement or enhance “self-service” business intelligence and analytics. In addition to all the benefits for business users, IT also sees opportunities to benefit from driving down their development and maintenance burden, increasing speed to delivery, and freeing up resources to focus on infrastructure and governance responsibilities. Of course, this all depends on a successful implementation and roll out. This is far from a given, as many shops have discovered. There are, however, some best practices I can share that can enhance user acceptance and help generate demonstrable business value. I’ll break them out by people, process, and technology.

People: Adapting an IT organization to support a self-service BI model is often a major hurdle. It is generally accepted that some variation of a “hub and spoke” model is desirable. This implies a central group of shared data and application resources along with process experts. These combine with analysts directly aligned with one or more business units. It is not that simple though. I have seen several examples where a team of business analysts, project managers, and developers have attempted to stand up self-service delivery and found it impractical without a significant re-definition of roles. The key to success in self-service is to convert the organizational model from a development orientation to one focused on effective support and overall customer service. It is absolutely necessary to reserve enough bandwidth from all resources to remain engaged with users after applications are released or enhanced. Self-service environments are never quite finished. They must evolve as business requirements and priorities change and the size/composition of the user base changes. Initial successes will spawn new requirements as users see what is possible and new subject areas are folded in.

Ideally, the role distinctions between developers, business analysts, QA, and project managers break down in favor of a core group of BI practitioners who can perform in all of these roles to some extent. Ideally, the specializations shift to subject areas and business segments; e.g. Marketing, Finance, HR, CRM, Supply Chain, etc. This facilitates the alignment with user organizations. If practical, co-locating them is ideal. Cross training is also desirable to add necessary staffing flexibility. This type of support team can shift from the traditional reactive service model (“please fill out an enhancement request”) to a proactive one that is directly involved with business area decision processes and their users’ overall experience. They can resolve issues and train as necessary; while monitoring user acceptance, usage, and satisfaction. When additional development work or tuning is required, it can be handled directly by the dedicated spoke resources; working with the hub as necessary on shared tasks involving data acquisition and administration.

The other key consideration is to what extent business-side organizations can devote bandwidth to the support effort. Ideally, some can be carved out to support and train fellow users, help debug applications and data, and participate in data stewardship. In reality though, the roles do not change, only the organization that is providing the staffing and controls the budgets. Business organizations that are shifting from hosted to SaaS BI environments are likely better served by providing their own support staffing and relying on in-house IT hubs for data acquisition and governance. This tends to be a natural fit as the SaaS vendors tend to be more customer service oriented than in-house IT departments out of necessity.

Process: Customer-focused organizations enable customer-focused processes that begin within the spokes and work their way back to the hub. One example is training. If this solely takes the form of generic tools training for all users, it will not be as effective as training that is customized by role and business area. This can be easier than it sounds if a template is provided that includes all the basic concepts that can be customized and delivered by spoke staff to specific groups with similar tasks and requirements. Commonly desired enhancements to the templates, including those corresponding to application enhancements, are communicated back to the hub for inclusion in the templates.

An analogous approach can be applied to development of generic “accelerators”. These can include report and dashboard templates, portal designs, collaboration tools, data dictionaries, provisioning models and security schema. One practice I highly recommend is a “promotion” process for successful designs. For example, when a report or dashboard design gains high acceptance within one business area, it can be adapted to others or become an enterprise standard that is centrally maintained.

This brings up another important success factor for self-service BI: Creation of a formal group that includes representation from the hub, the spokes, and accomplished users that collaborates actively and meets regularly, either physically or virtually. I like to call it a “community of practice” but I have seen it go by many other names, including “center of excellence” although that one always sounded a little pretentious to me. These groups are very effective for knowledge sharing, promotion of the technology tools, and overall two-way communication with the user community at large. I also encourage members to show successful new applications of the technology, tools, and models off to the group. This is a great way for architects within the hub to stay close to trends and changes in business requirements.

One note about the testing process: It also has centralized and de-centralized aspects, but it is a bit different in the sense that integrated systems tests usually involve the entire user community and require a high level of coordination and central project management. As such, it generally requires central administration with user area validation.

Technology: I do not have as much to say about technology because, contrary to popular misconception, the technology choices and system architecture are generally not a key to success. In fact, on occasion I see technology platforms completely replaced to solve what is really a process issue. Most, if not all, of the mainstream tools and data platforms out there can underpin a successful self-service platform, as long as there is a well thought out user interface that is easy to use. Often, this is in the form of a portal, but it can be as simple as the universal front end known as Excel. The goal is to remove barriers to availability and utility by making:
  1. your applications easy to acquire and use on multiple devices
  2. your data reliable, transparent, relevant, and timely, and
  3. help available when and where it is needed

My last post in this series will discuss some of the favorable and unfavorable environmental factors as they relate to standing up a self-service BI environment.

Friday, May 16, 2014

Some thoughts on self-service BI

Let me begin this series of posts with some historical context:

Ever since the earliest days of what we refer to as self-service computing, there has been controversy in Corporate America around the relative roles of IT and the user (Business) departments in the development and administration of applications that support knowledge workers. This balance of power has been disrupted by several waves of technology, but the basic issues are always the same and persist today. In the beginning, there was pure mainframe computing; not really accessible to end users unless they were willing to write 3rd generation language (e.g. COBOL) programs on punch cards, feed them into readers, and then wait for a job to complete and get results out of a printer. Then came time sharing - the original clouds - which made things a bit more accessible and immediate, but IT was still in firm control. Even then, some business departments like Finance started hiring tech-oriented staff to configure and administrate highly customizable applications like general ledgers and the early ERP systems. These also featured proprietary programming languages that completely blurred the line between configuration and full-on application development.

Self-service computing became a wave, and quickly gained the most traction within decision support applications that were renamed ‘Business Intelligence’. IT, fearful that ‘amateurs’ were putting critical processes at risk, moved to take over that activity. This over the objections of business side management that valued their control over these functions and priorities. Turf wars ensued.

Things only got more complicated as higher level “4th generation” data handling, planning, and modeling languages, like SAS and FOCUS, were marketed to end-user programmers and their management that paid for them. At this point, there was no turning back on the expectation that application development could occur outside IT. One common reaction was for IT to create entirely separate environments, based on somewhat smaller and cheaper hardware that used extract data from the mainframes to perform dedicated modeling, planning and reporting activity in real time. This enabled dedicated end-user computing organizations that needed IT support and feeding; but otherwise operated independently as long as they stuck to supporting non mission-critical applications. (In practice, more than a few slipped through.) One immediate consequence was a massive proliferation of often redundant data extracts. IT responded by building data warehouses in the hope of regaining some order and control over the source data, if not what happened downstream.

From there, the PC revolution came along with spreadsheets, databases and more powerful higher level languages. Another layer of individual autonomy was created with all the attendant risks and opportunities. Over time, the PCs became smaller and more interconnected through wired and wireless networks until they became completely mobile, not to mention owned by users. Things became really chaotic and limited at the same time in the sense that these devices cannot natively support the level of collaboration we want or capacity to process the huge volume of transaction data we now collect.

IT was and is still saddled with the responsibility of maintaining data integrity, security, and availability across a set of processes operating over devices, applications, and networks largely outside of their direct control. Currently, this trend is being extended to main operational applications, from which internal data is still primarily sourced, as they are relocated to SaaS (software as a service) clouds. In a sense, the technology has come full circle, but the basic dichotomy of how to balance control, responsibility, resources, and workload between IT and users remains.

The other major factor driving demand for true end-user BI is the service that we have come to expect as consumers via the Internet from the likes of Google that understand natural language and make an enormous variety and volume of the world’s knowledge and data searchable and available within seconds. We then use social networks to share the knowledge and collaborate with friends and colleagues. We want that same power and ease of use in the workplace. As BI professionals, we strive to provide such capability, which we refer to these days as self-service BI, and the software cloud service vendors are trying to sell it to us.


In the next post, I will detail some of the best practices that I have learned from organizations that have put successful self-service BI capabilities in place across many diverse technology architectures.