Tuesday, 27 September 2011

Want to do analytics on large data volumes?


"We would like to “do analytics” on large data volumes from 7 of our brands combined, including clickstream data." This is an example of the type of question we get. Whether they are in Media, Energy, Banking or another sector, all want a high-quality analysis of their product-market combinations, want to make profiles of customers and whatnot.

And then we ask: So... How is your data?

How is it organized? How is the quality, what is the level of integration, standardization and how is it related to your well described processes?

The point is obvious: in my opinion there is no use trying to get information out of unmanaged data. When you can’t tell the level of quality of your data, it is impossible to say anything about the quality of the analytics results.
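To make "telling the level of quality" a little more concrete, here is a minimal sketch of a first data profile in Python. It is only an illustration under assumptions of my own: the `profile` function, the field names and the sample records are all hypothetical, and completeness and duplicate keys are just two of many possible quality indicators.

```python
from collections import Counter


def is_filled(value):
    """Treat None and the empty string as missing (an assumption of this sketch)."""
    return value not in (None, "")


def profile(records, key_field):
    """Return simple quality indicators for a list of dict records."""
    total = len(records)
    fields = sorted({field for record in records for field in record})
    # Completeness: share of records with a filled value, per field.
    completeness = {
        field: sum(1 for record in records if is_filled(record.get(field))) / total
        for field in fields
    }
    # Duplicate keys: key values that occur on more than one record.
    keys = [record.get(key_field) for record in records if is_filled(record.get(key_field))]
    duplicate_keys = sorted(key for key, count in Counter(keys).items() if count > 1)
    return {"rows": total, "completeness": completeness, "duplicate_keys": duplicate_keys}


# Hypothetical customer records combined from several brands.
customers = [
    {"id": "C1", "email": "a@example.com", "brand": "A"},
    {"id": "C2", "email": "", "brand": "B"},
    {"id": "C2", "email": "c@example.com", "brand": None},
    {"id": "C3", "email": None, "brand": "A"},
]

report = profile(customers, "id")
print(report)
```

If even a rough profile like this cannot be produced or explained for a source, that is usually a sign the data is not yet managed well enough to trust the analytics built on top of it.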

As Thomas Redman put it in his book “Data Driven, Profiting from your most important business asset”: “We have not even begun to understand the potential for analytics and data mining. Yet its reputation may be sullied, in some companies anyway, by half-hearted efforts that don’t produce extraordinary results, just as it is generally considered unwise to put in only enough energy to leap halfway across a stream, so too with analytics and data mining.”

I think this means that if we follow the Big Data hype too fast and start running analysis software on large volumes of unmanaged data, the results will be disappointing and the hype will pass, for there are few people harder to convince than disappointed business managers.

Boring as it may seem to some, organization, standardization and description, in short management of your data, is the only way to go.


If management is hard to convince of the proposed priorities, would it not help to make the case by making visible the value of the data and the potential value of the information that can be made from it?


At the London Data Management Conference I will be glad to discuss this with you.

Wednesday, 19 January 2011

Idea: Antarctica, home of Cloud

Are you ready for the Cloud? The IT vendors of cloud services want to know this from their (potential) clients. They've invested a great deal in data centers that are waiting to record the bits of people from all over the world. Business Intelligence is one of the areas that could make use of cloud services. And even if it weren't, I'd still be interested in cloud because security is a data management issue.

Just this week it Twittered before my eyes: "Don't use cloud for sensitive data, EU warns members", and according to 'a recent survey' 53% of executives are uncomfortable with the 'public cloud'. Although more than 80% of enterprises have at least one cloud service in use, almost 70% of executives question the security of the services.

Many of the reservations of these 'executives' stem from the fear that data will fall into the hands of competitors, the general public (Wikileaks), or the CIA/NSA/FBI and such.

Seriously, and partly understandably, organizations, and especially European governments, have a no-go policy for data: it may not physically leave the country or even the building. The argument that data stored in a cloud is usually divided over different servers and encrypted doesn't help to raise trust.

So how can we take away at least the fear that data leaks into the 'wrong' hands? What measures should the IT industry take to be able to deliver its cloud services more broadly? (I'm not blogging about this because I want cloud services to grow for their own sake; I'm interested in the good deal that we all can get out of cloud services. I don't own any stock in cloud vendors.)

I'm afraid it will come down to the skills of the salespeople. They will need to do a lot of missionary work to bring their customers around to embracing the cloud. (Wouldn't you like to see someone embrace a cloud?)

Antarctica would be a good solution, so in the series 'very good FranklyBI ideas', this time I launch the idea of putting the "Cloud" in Antarctica. (The previous idea was Data Fitness.)
(Antarctica picture from www.antarctica.gov.au)
There are several reasons for this:

  • Antarctica is owned by no one. There are claims, but there is a consensus that these claims are not granted and that the interests of all parties are noted.
  • Data centers will have to be built, for the greater good of all the world, on a well-selected, nicely chilled spot on Antarctica, thus saving fuel on air conditioning and cooling systems.
  • Cooling the systems will not be a problem.
  • Because no country has jurisdiction on Antarctica (afaik), no agent will be able to look into the data. (We may have to add this to the Antarctic Treaty.)
  • There is enough wind and sun (in summer) to provide the energy for the data centers.
I would like to add that there should be a limit to the amount of data we are allowed to store, as a percentage of the total capacity, so we don't have to snowdoze the complete surface of Antarctica to build data centers. My guess is this will all be a temporary solution, as we will move the data centers to the bottom of the oceans when technology allows us to do so.

Friday, 7 January 2011

Stupid politicians?

This is a response to the article "5 oorzaken van politiek onbenul over ict" ("5 causes of political cluelessness about ICT") by Jasper Bakker in WebWereld of 1 January 2011. The article can be found here. Earlier this week on frankharland.blogspot.com.

Jasper Bakker raises a very important point here: the unfamiliarity of politicians with ICT. However, this is not only about the ignorance of politicians and citizens, who in my opinion have a duty to know their electronic tools, just as every citizen is expected to know the law. Even more, it is a matter of ICT suppliers taking responsibility.

Unfamiliarity is in principle not blameworthy; most politicians did not study for it and, as is rightly noted:
"ICT is complex. Very complex, and for many people too complex. And politicians are only human. It is experts who can properly follow the intricate chain of information and communication technology, and who have long since specialized in small sub-areas of it."
Politics has advisers with interests that cannot be general, and it is pulled toward the judgment that promises the greatest electoral gain. You cannot blame the Dutch people for the fact that those advisers are not all professors of ICT.

Despite all the responses, one point has not yet been stressed enough, and that is the enormous responsibility of ICT itself, which is not being taken. ICT program (or project) managers let themselves be guided too much by short-term goals. ICT suppliers (internal and external) hold a monopoly position in delivering their services and knowledge. And the 'market' is enormous.

Many of these suppliers let themselves be led by quick money. The structure and sustainability of solutions, their quality, and therefore documentation and communication about the solution, structurally receive too little attention.

As long as clients and buyers of IT services do not demand that those services meet quality standards, and IT does not regulate itself, the cowboys will remain cowboys, and the rabble and the politicians will remain blissfully ignorant.

Zynga takes over Flock

You'll probably say: "Wow ... er...", so for those of you who are not all that familiar with the 43298 companies in Social interwebworking, games and media, this needs some explanation.

Zynga is the company that brought us Farmville, Cityville and Mafia Wars, games on Facebook. Cityville, the new game on Facebook, got 85 million players within one month of launch.

Flock is a web browser with a built-in connection to, and display of, all of your social network sites like Twitter, Facebook, Flickr and so on.

According to TechCrunch in May, Zynga made Facebook a million dollars a day. It won the bidding against Google and Twitter, who were also interested.

So what can that bring us? To speculate, I imagine a Tweetdeck-like application that has browser capability and lets us interact directly with our 'Friends' on the social networking sites, whilst playing the games developed by Zynga with those friends.

Although I'm not much of a gamer, I gamble on one possible trend here: that browser functionality could become part of the app. This runs counter to the developments of the Web that are aimed at making functionality available from within the browser.

Sunday, 19 December 2010

What's up with roles in data management?

Confusion
Many requests reach my employer for help in the areas of Data Management, Information Management, Business Intelligence (BI), Business Information Management (BIM), Information Architecture, Data Architecture and related roles. Also, there are a good number of jobs available in these areas. But when you ask twelve people to explain these specialisms, unfortunately, at least thirteen different answers come back.

There is no unambiguous understanding of the parts to be played in roles like Information Manager, Data Architect, Data Manager, (business) Data Steward, Architect, Data Warehouse Architect and even Chief Information Officer. I think it can come across as quite immature to non-IT people (as well as to some IT people) that, apparently, we are not able to build and communicate a simple function matrix.

From my own experience, here are some examples of big differences of opinion. Take for example the “Information Manager”. One team manager has the notion that this is a role almost equivalent to Chief Information Officer (CIO), or at least the role that fulfills the demands and requirements put forward by the ‘Data Governance’ function with regard to enterprise-wide data management. For another potential employer, this role is the guy who prepares data marts and, based on user requirements, loads them with data so that the “BI Manager” can use them for reporting.

Another example is the “Information Architect”. One opinion is that this is the designer of data models in the data warehouse; another is that this is somebody who makes a blueprint, spanning the entire enterprise, of the information and data housekeeping. That means designing who can see, maintain, manage, enrich or delete what information, and when, while constantly upholding the strategy and essential processes of the enterprise. It is not sufficient to put a prefix like “Principal”, “Senior”, “” (sic), or “Junior” to indicate the precise content of the role. It’s just as pointless as indicating a scale for this job, because the two are fundamentally different roles with different matching education.

I’ll mention two more examples: the “Data Warehouse Architect” and the “CIO”. To start with the latter, my understanding was that the role of Chief Information Officer meant being in charge of the Information Managers, and being ultimately responsible for the availability and quality of the information the organization has to work with. In many cases, the CIO role has declined to that of the IT Manager. Often this is the geezer who purchases computers, software and other IT resources. The same goes for the role of “Data Warehouse Architect”. I would argue that this role is responsible for the information model, the layer model, the technical foundation (infrastructure) on which information production takes place, the design of historical storage of data, and backup and recovery plans.
This role is sometimes confused with Data Architect, one of the roles that provokes the most discussion (just as the “Business Analyst” does). For the Data Architect (I’ve seen the name at a governmental institution) we can imagine the same sort of responsibilities as the Data Warehouse Architect has. The difference is in the domain. The Data Architect does what the DW Architect does, but for the transaction, process support, and master and reference data domains, preferably at an enterprise level.

Standards within Data Management, BI & IM or BIM?
Are there no standards? Well... No, not really. But lo and behold: the classification of the ‘Data Management Association’ (DAMA). DAMA did research with about 60 ‘data professionals’ from all over the world into how modern Data Management is and should be organized and executed. DAMA presents a classification of functions involved in making information and managing data. This is published in the “DAMA Guide to the Data Management Body of Knowledge” (DAMA-DMBOK Guide), an elaborate work in which all aspects of Data Management are attended to. For this book DAMA set as its first goal “To build a consensus” and as its second (of seven goals) “To provide standard definitions”. Below is a depiction of the model of business functions as DAMA sees it.
Source: DAMA DMBOK Guide, p. 10

In the DMBOK Guide, information governance and data governance are both put under the heading ‘Data Governance’, because information is seen as a type of metadata-rich data with a lot of information value. The activities and areas for attention in the pie wedges help us imagine what roles are necessary to provide for the functions mentioned in the picture above.

This pie shows that managing and controlling business intelligence and the data warehouse is part of the business function ‘Data Management’. That doesn’t mean ‘BI is part of data management’. A BI program can, shall, must and will have data governance and data management issues dealt with. To be able to create a valuable and complete BI implementation, we’ll have to take bites of almost every wedge of the pie.

In a report from the beginning of 2010, Gartner states that, if we want to be able to manage our enterprise data, we have to put 4 key data management roles in place: one role for data management and legal knowledge, one digital archive management role, one business information manager role and one enterprise information architect. Although it is a compelling statement from Gartner, it covers only a small part of the data management gamut. Furthermore, the roles aren’t as thoroughly defined and filled in as those of DAMA.

It is interesting to see that so little of the hard-to-digest DAMA DMBOK Guide is finding its way into the world of standards and publications. Some of the ideas are being adopted and are popping up in notions, blogs and even methods. Whether these will connect to the original ideas of DAMA remains to be seen.

I think we should indeed take a good look into the DAMA classification. It seems to be a very usable and useful concept for a beginning of order in our profession.

Friday, 12 November 2010

Data Fitness

In our perpetual fencing match to improve Data Quality with arguments, the riposte is often vague and suspected to be false. The reasons parties give to counter our quality initiatives are often not the real reasons. "Gut feeling" and belief are man's guides when something is difficult to comprehend.

I truly think this is a fact of life and we will have to deal with it. Even if a positive ROI for a DQ initiative can be easily made, we are baffled with amazing counter arguments. I can think of no other explanation than that the term 'Data Quality' has an image issue. And I think it's not the 'Data' part that gives the old-fashioned, scruffy librarian feeling, I think it is the 'Quality' part.

Quality is a neutral word, but it has a connotation of difficulty and idealism. If someone speaks of 'Quality', usually the word 'issue' or 'problem' is nearby in the sentence. As with 'Architecture', 'Quality' sounds to the average 'business' person or executive like something that costs money instead of immediately and actively making it.

So stakeholders' arguments that bend the truth a little and turn our proposed efforts down, really are to be expected.

This is why I suggest a new, fresh approach: call it 'Data Fitness'. This has a few pros:
- It has a positive ring to it: when someone or something is fit, that is usually perceived as a good thing.
- We (Data geeks) like to have data that is 'Fit for purpose', as an indication that it has good quality.
- In communication, when something that is spoken of is 'unfit', everybody understands that something has to be done about it. Our 'gut feeling' tells us this.
A few related concepts I suggest:
- Jog your data when you are cleansing it, checking 'the tone' when profiling data.
- Incidental jogging does not help. A change of lifestyle is necessary for the data to become and stay fit.
- Don't use the term 'cleaner' to indicate better data, but 'fitter', which gives a happier feeling.

I could not think of any 'cons'. Or did not want to.

The funny thing is that when you start calling things differently, you get another feeling about them. Somehow it feels natural that, when you regularly jog through your databases, data gets attention and gets fitter. If data is left without 'physical' attention, it gets heavy and slow and is sure to lose decision-making matches.
People are prepared to put a lot of effort into fitness. Usually their own, but I think the same will go for data too. And a lot of work it is, making data fitter.

Tuesday, 4 May 2010

Who'll stop the rain

Business vs IT? Business and IT? Alignment? Data as the third angle: how people in the data business could look at the long-pursued Business and IT alignment and take the third angle.

"As long as I remember the rain's been falling down": the first line of "Who'll Stop the Rain" by Creedence Clearwater Revival. This song comes to mind when I think of the search for hooks in the business and businesses where IT managers, consultants, system integrators and software vendors can attach their BI and Data Quality services and products.

These consultants c.s. look for a way to appeal to the decision makers in the organizations they target for their products and services. They're trying to get on the agenda, preaching the benefits of their approaches, quality methods and tools over and over again.
It must feel like the neverending rain for these business 'targets'.

George Colony of Forrester Research says (in 2006): "IT is not the way to go, let's call it 'BT'"1. Just changing the term "Information Technology" to "Business Technology" should already make our solutions more appealing to the business. And, he says, I(B)T should stand shoulder to shoulder with business to confront the challenges that businesses face, and each should understand the other's domain and limitations.

IT and business still move in different circles and read different magazines, at least here in the lowlands of Western Europe. And although the inflow of digital natives (see Marc Prensky's first and second articles) into the business arena grows every year, there is still a lot of 'senior' business management and 'senior' IT management that do not understand each other's language and don't go to each other's churches.

In the church analogy each 'group' visits their own church and listens to the preacher, priest or vicar, or whatever the name of their speaker is. They hardly ever go to the other's church. Therefore they seldom hear from the visionaries themselves the why, what and how. Digital natives are, I fear, necessary to create George Colony's "BT world".

As a descendant of the church of Information Technology I was sent out to do missionary work. We organized seminaries (sic) and invited our peers. Most people understood what we said. Sometimes we dared to speak in the domains of the other churches and non-believers. That was not very successful. Sometimes people listened and nodded understandingly but then continued their 'heresy'. We could not convert them and have them use our tools and methods.

Recently I decided to get out of this 'missionary position'2 and try another angle. I did this because I could not imagine that the amount of effort put into the missionary work was justified by the results. Regardless of the effort, not enough business people were helped with their issues.

My new angle is Data. I see a divine triangle in Business, IT and Data. Together they make one, and none can do without the others. In Data I see the product (and nutrition) of business (processes). IT makes it possible to monitor and steer the business (processes) with information and is a logistic service provider. In terms of responsibilities, it is clear to most people who has knowledge about the business processes and the inserted and resulting data. This, of course, is the 'Business'. The Business domain is the only domain that can be responsible for the contents, in terms of meaning, of the data. The care for data is the binding factor between Business and IT.

Lately I play the part (role) of 'Architect' for Information production environments. We use the third angle approach. The questions we now ask business management are:
1. What is the status of the Information Architecture and Governance? Do you have a clear and managed picture of your information assets and related data sources?
2. Is there anything we can do to help you with your data?

Absolute preconditions for effective data governance, quality and MDM initiatives are: business ownership of the information household and a certain level of maturity in the information architecture, governance and management. So if the picture painted in response to question 1 is not pretty, we are left with only the data-logistical (technical) problems to be tackled.

If we do get to second base here and start talking about solutions that may help the organization in the data area, such as Master Data Management, the challenges will be in the business 'playing field'. There will be challenges, yes, but if Information Governance is set up and the organization is familiar with it, it will be more like implementing "BT".
We can see the awareness of the need to better organize information within companies slowly growing, but has it reached adolescence yet? And is it a bad sign that the English Wikipedia item about Information Architecture is in dire need of revision?3