Enterprise Architecture and Big Data | EACOE.org


– [Samuel] Welcome. Thank you for attending our ongoing series of webinars on various topics. The topic for today is Enterprise
Architecture and Big Data. And of course, everyone knows
about the big data world, it’s an up-and-coming activity, and what we’re doing today
is talking about how enterprise architecture
and big data work together. We’re also gonna talk about,
frankly, some of the issues that are already being seen with big data, and where enterprise
architecture, I believe, will help quite a bit
in this particular area. We’re hearing, obviously,
lots of great things about big data, but
there’s some case studies out there right now
already that are showing us that without something in place prior to big data, we may have some issues, and hopefully, enterprise architecture can give us some insights into that. The webinar today will run an hour, hopefully plus or minus
just a few minutes, trying to make this as compact for all of you
and on time as we can. And we’re gonna get started right now, so thank you, again, for your attendance, and let’s get started here. And hopefully, I can add
a little bit of humor, starting with this particular slide here. The first thing we’re
gonna be talking about is enterprise architecture. That’s my attempt at
humor, I’m not sure if that’s successful at this point or not, but the concepts of enterprise
architecture really are very important prior
to looking at big data. And this is a real enterprise, and please notice the
graphical representation of this particular enterprise, and that’s what we stress in our workshops and our consulting work. That enterprise architecture is about what we refer to as human communication. Not compiler communication at first, which we’re going to do
eventually, of course, but human communication,
and providing essentially an in-context understanding of that. And we’d also like to talk to you a little bit about the history of
enterprise architecture. One of the great things about
the internet and Wikipedia and LinkedIn and all those
other sources out there is that anybody can write
anything about anything. Of course, that’s the
problem with those sources, is that anybody can write
anything about anything. And as far as enterprise
architecture is concerned, it has been around for a very long time. And this is one of the things I just want to mention
here as we get started. The person that we would suggest, and a lot of people suggest, really began this area of architecture, of something we now
refer to as enterprises, but initially, we were referring
to architecting systems, is a gentleman named P.
Duane “Dewey” Walker. And I’m not sure how many
of you know who he is. As it says here, he began his venture in architecture in 1966 at IBM. And a few famous people were working for Dewey Walker back then. One notably, of course,
I think most of you know, is John Zachman, and John actually worked with Dewey Walker at IBM. And I know that John
is a very famous person and a colleague of mine, and John and I started
ZIFA many years ago, and he’s still lecturing on his framework. And we’re working, of course, doing enterprise architecture
from that perspective. But the real grandfather of this area is a gentleman named Dewey Walker, and the reason this is
important for you and me is that if we don’t look at history, what tends to happen is we make the same mistakes
over and over again. And I would say that we are into really enterprise architecture
2.0 or 3.0 at this point, because it really began in those days. And as you notice here from the slide, Dewey was named a manager of
information systems planning at IBM, and a year later, he became a manager of
information systems architecture. Fascinating name change,
something to contemplate here. What they realized at IBM in 1967 is first you have to have architecture before you can plan things. Kind of an interesting
statement, that was not subtle. There’s a lot of planning
going on in organizations without architecture, but
at IBM early on, they recognized that architecture came first. And in this capacity, he oversaw
many of the developments of important systems analysis tools, and helped share many new programs with Fortune 500 companies
that IBM was servicing. In 1970, Dewey was
commissioned to establish, get this, a national
marketing approach for IBM. What does that have to
do with architecture? Well, basically, what IBM understood was that if they had the client’s blueprint, or helped the client build a blueprint of where they’re going to be and where they are, the desired state and the as-is state, they
could sell into that. Using the physical analogy,
you need 24 two by fours, a couple of gallons of paint,
and a hammer and a nail and a chisel or something like that. So they understood the
importance of architecture from a marketing perspective, and then this became an
actual service for IBM. That assignment resulted in
a highly successful program called Business Systems Planning. This is really the root, fascinating, Business Systems
Planning. We have all seen how some people look at enterprise architecture from a systems
implementation standpoint, we tend to call that type
of architecture EITA, which is really what we see
a lot of organizations doing, enterprise information
technology architecture. That is really not enterprise architecture, which starts with the business first. And Dewey got a very great
award at that particular time. And I’m gonna bring up to the
camera here the actual text, I don’t know if you can sort of see that, I’m gonna run it by. I got this out of the
Smithsonian, I’m just kidding. The reason I show this to people is that this is really where this started. It did not start in the
late 80s, early 90s, it started in the late 60s. And of course, we’ve had
the privilege since 1972 of enhancing these types of things that we now call enterprise
architecture that’s there. So this was really the planning guide that IBM put out quite a while ago. What is enterprise architecture? Once again, as just a
frame of reference here as we move forward, so we
define enterprise architecture as explicitly representing
an organization’s desired state and as-is state, there’s actually two things
that we have to build. And then we can, of course,
put a roadmap together. Through a set of independent
non-redundant artifacts, quite simply saying, what
is the minimum set of things that we need for people to understand? Those are what we refer to as the architectural representations, and that is equivalent
in the physical world to what we refer to as engineering. Defining how these artifacts
relate with each other, that’s the second set of
models that we refer to, we refer to those as solution models or implementation models, and in the physical world,
those are manufacturing. And in the physical world, we know there’s two
sets of representations, engineering representations and manufacturing representations. We’re gonna suggest very strongly that enterprise architecture requires two sets of representations, the architectural representations and the implementation representations. Once again, from the
standpoint of learning, most of the people that talk about enterprise architecture
do not recognize that yet. It’s kind of sad that this has
been around for four decades and we still see that
lack of understanding of the two basic fundamental
models that are required. A lot of enterprise architecture
definitions stop there. You can see that we’re
about halfway through. There, we now have a bunch of models, which are fantastic, models are great, but as you can see here, the
real objective is developing a set of prioritized, aligned initiatives. Some people now call them capabilities. We actually develop capabilities, we don’t declare capabilities, we find out from these representations, what are the capabilities
the organization needs in the future to move forward to meet the organizational goals? And then we come up to the
second from last comma, communicating this
understanding to stakeholders. And there are multiple stakeholders. There are five that we have identified as unique stakeholders,
unique classes of stakeholders that require different
presentations of this material. There isn’t one diagram that will address everyone that’s there. And of course, if any of
you have had the privilege or pleasure or frustration
in building a house, you know exactly what I’m talking about. You have the plumber’s view,
you have the electrical view, you have the heating, ventilation,
air conditioning view, different stakeholders,
different representations, and then once again, we’d
all mush those together, and of course, the word mush is a technical term. And then, advancing the organization from its as-is state
to its desired state. So this is our definition
of enterprise architecture, representing an
organization’s desired state and as-is state through
a series of independent non-redundant artifacts, we refer to those as the
architectural representations, defining how these artifacts
relate to each other, we call those implementation
representations. So for example, an
implementation representation would be business process
modeling notation, object models, data flow
diagrams, use cases, those types of things we refer to as the implementation models, that should be derived from the architectural representations
that are there. Then developing a set of
prioritized aligned initiatives, or capabilities if you’re
comfortable with that term, needed to meet the organization’s goals. And once again, the most important thing is the communications aspect. And what we found in our Twitter Facebook society that we’re in right now, is that these representations need to be just explained to the stakeholders, and this is not a joke,
in less than 90 seconds. That is an objective that we
think is extremely important. We’re not teaching them our language, we’re putting what we do in their language so they can understand it very quickly. And all the representations that we do for enterprise architecture
have that requirement. The definition of
enterprise is essentially a collection of roles and
responsibilities that are there. Sometimes that word
seems to get in the way. You do not have to do architecture of your whole enterprise,
comma, not period. What happens if you don’t? Now we have the concepts of
integration and interfacing, anything that you’ve
essentially architected, integration will be high probability, anything outside of that
boundary is gonna be interfaced. Not one bad and one good,
they’re very, very different. However, you really can’t post-integrate, so it’s very important
to get that definition. So we can have a whole
corporation, a division, you can make your choices that are there, they are very, very, very important. Now, the term architecture has been around for quite some time. It’s the art and science
of building something, and the manner in which components and artifacts are organized. The science we can
teach, E equals MC squared, F equals MA, I equals D over
T, the problem is the art, it’s not really a problem, we get that through practitioning. And that’s really very, very important. I don’t think you’re gonna
get that, once again, a little smile on my face,
through passing a multiple guess exam, it really is
essentially practitioning. And that’s one of the hallmarks of the approaches that we have. So what can we learn about all
of this as we move forward? This is the baseline for
everything that follows here. What can we learn about this? I call this section my
Homer Simpson section, almost like, duh, of course it is there. But with due respect to Homer Simpson, sometimes he had some very clever phrases. So when you write things
down, when things get complex, more than a few people, you’ve
gotta write things down. Duh, of course we have
to write things down. If you’re building a log cabin, you may not have to write things down. You need to write things
down when you’re building or changing a 100 story building. And oh, by the way, if you stack 100 log cabins up on top of each other, you’re not gonna get a 100 story building. Try it someday, you’ll
find out what’s going on. This concept of iterating is fantastic, but we have to know the end state. The other thing is, we’re not gonna learn much about the 100 story building
when we have a log cabin, and we try to essentially evolve to that. It is different, it is
fundamentally different. It’s a different set of activities. If you’re building a
balsa wood model airplane, you may not have to write things down. You gotta write things
down if you’re building or changing a Boeing 747,
especially if I’m in it flying over water at night, I wanna make sure that
that particular activity has been architected at
excruciating levels of detail. But what you write down,
how you write it down, and how you build affects the
ability to change something. Yes, it affects the ability
to change something. Change comes from two things. Architecture and the
concept of assemble to order that we’ll get to in just a moment. We have to recognize that this is the baseline for managing change. And it is awful hard to
see change or strategy in 700 pages of text, or 70 pages of text as human beings. Once again, remember the human
consumable aspect of this. Architecture is really about addressing complexity and change. And big data is looking at large amounts of things that people want to change. It’s a natural progression
into that particular activity. The concepts of flexibility or agility historically come from two things. As I just mentioned a moment ago, architecture and assemble to order manufacturing or implementation processes. There’s two things that
are going on there. One of the analogies that people have said is quite powerful, and
I’m just mentioning this to you briefly, is the
analogy of a salad bar. If you can imagine, just
a moment, a salad bar which has individual elements in it that have been architected,
you’ve got the romaine lettuce, you’ve got the iceberg lettuce, you have the garbanzo
beans and the tomatoes and the ricotta cheese
and whatever you put in, the salads and carrots
and everything else, individual components, and
you walk up to the salad bar and you assemble to order. Now, if you’ve got 16 bins there, imagine how many salads
that you can produce. It takes a little bit of time. The alternative is provide from stock, you have it right there, you go next door, you have a prepackaged salad
that’s hermetically sealed, and then you go back to your office and you try and take it apart because you decide you
don’t like green peppers and you don’t like this or like that. That’s the alternative,
one is extremely quick, and one is pretty quick once we have the architectural elements. Now you’re seeing the
difference between the concepts of architecture and implementation. Architecture is the 16
elements on the salad bar, the implementation is you
grab those things out of that. Two different sets of models, architecture models,
implementation models. It’s difficult to believe
that agility comes from hand crafting things
smaller and faster, we just cannot keep up. Agile enterprises have the ability to address complexity and change, and graphical representations are the key to addressing complexity and change. And there’s fundamentally
two types of representations, those architectural representations and those implementation representations, recognizing the audiences. Subject matter experts are
different, that’s there. Just continuing in just a moment here, this concept of speeding up change. We need to look outside of
IT for just a little while for enterprise architecture and big data and see what other professions have done. There’s a tremendous amount
of literature out there outside of the IT literature
that is quite powerful. And one of the things
out there is this concept of the general
manufacturing maturity model that provides us with an
understanding of the baseline on how organizations change
in the physical world. And our approach, essentially,
is to look at that and see what we can borrow
from those disciplines that have been out there for centuries, if not longer, that’s there. Not everything is translatable. One of the things that we
always hear, well, yeah, but a 100 story building
doesn’t change very much, and our enterprise systems
change all the time. There’s a perhaps here, perhaps it’s the way we
build our enterprise systems, and the lack of architecture
that is causing that issue, rather than saying IT
systems change very fast and buildings don’t. So let’s look at it a little differently. Perhaps it’s the lack of architecture and the way we build systems
that are causing this. So let’s go on just quickly here. The general manufacturing maturity phases. The first phase is what’s
referred to as make to order. And most of our organizations that we have the privilege of working with,
frankly, are in this area. If you hear of the concepts
of requirements definition, the concepts of specifications, screen designs, report layouts, that is what we refer to as make to order. Essentially, the organization is waiting there for something to come in the door, we refer to that as make to order. Long lead times, high costs,
generally low reliability. Second phase of maturing is what we refer to as provide from stock. And in the IT world, that is commercial off the shelf packages, COTS packages. So we can see where
most organizations are. One of the things that’s very important about commercial off the
shelf packages, COTS packages, is that if we start adding functionality, whether it’s through
APIs or whatever it is, we have to recognize with eyes
wide open what’s happening. We’re actually going back to
an earlier stage of maturity. You are increasing the
complexity of the organization. Now you have the packaged software, you have the custom modules,
and you have the interfaces. I’m not saying don’t do it, but make sure that you understand that when you are modifying these packages that are provide from
stock, you have increased the complexity of the organization, and that’s something
we need to make sure comes through loud and clear. We do get reduced cost, we
do get higher reliability, the issue is limited flexibility. And the most powerful
of the maturing levels is what we refer to as assemble to order. This is what we believe the enterprise needs to strive for if
it’s not there already, and most enterprises around
the world are not there yet. It’s just a matter of time as we see in the physical
world, excuse me. This concept of assemble to order provides us with almost custom products, high reuse, reduced time to market, and mass customization
in quantities of one. This concept, ladies and gentlemen, is all around us in our daily lives. I talked to you about the salad bar, walk into a big box store like Home Depot or Lowe’s or Best Buy,
they have figured out what essentially the store layout is to provide you with the
greatest flexibility. Let’s talk about walking into a store that provides you with
the basic foundations for building a home. There’s the lumber department, there’s the windows or the doors section, there’s the roofing
section, there’s the nails, the hammers, the chisels, the paint. Those are the elements,
the architectural elements, and you put them together to build an implementation composite that’s there, assemble to order, all around us. A menu that you go to a
restaurant, assemble to order. You can see how powerful this concept is, and it’s something our approach
to enterprise architecture is fundamentally trying
to educate people about, and that’s how we practice. It is so basic to speeding up change that it is going to essentially provide us with that baseline that we need. We’re extremely excited about this, because people are
starting to pay attention in this particular area,
what a powerful concept. And all we had to do,
essentially, is to look outside of IT for just
a little bit of time and build that analogy. And our thanks go to the pioneers in the area, especially Dewey Walker, who actually provided that baseline for us many years ago. The concept of a framework is something I just want to go over very quickly. It is, unfortunately, a
woefully misunderstood activity in enterprise architecture. A framework is inert,
it doesn’t do anything. It is a thinking tool. It’s a very important thinking
tool, and that’s what it does, but it is not prescriptive. Being prescriptive is the job of a methodology. And in our business of
enterprise architecture, we don’t yet have an understanding
of what a framework is, let alone essentially which
framework we’re going to pick. And I’m not gonna get into that today, because we know that’s
a nightmarish topic. I wish I could. But this is about big data. This is to us an example of a framework. The reason I mentioned this one here is that this is Mendeleev’s
periodic table of the elements. And it didn’t matter
what you called yourself prior to Mendeleev figuring this out. You could call yourself a
chemist, but you were an alchemist. The framework provides us with a baseline for the profession, but we have to have an honest, good framework. We can’t just have roll
your own frameworks that are out there or something like that. And this is the architectural
framework for chemistry. And with methodology, we produce implementations which are compounds. The architectural
elements are on the left, the implementation
compounds are on the right, the way we get from
architecture to implementation is through the concept of a methodology. Here is the framework for music. Every profession seems
to have a framework. There’s a hint here. Once we have a framework,
we can get a profession. We’re getting there in
enterprise architecture, we’re not there yet, ’cause we’re still arguing about what a framework is, let alone picking the one that’s there. This is a fascinating
set of understandings we get from music that’s there. And of course, what we
have, using methodology, is a series of implementations
or compounds that are there. Very, very powerful
concept that we see here. The last one I want to
do is this one here, I’m trying to use this
one with you right now, the 26 letters of the alphabet. Maybe I’m doing a good job, don’t know. This is what I’m trying to use right now as the frame of reference
for communications with you. And of course, there’s an
extreme number of compounds that we can produce once
we have that understanding of what those elements are. But the elements do not tell us, the architectural elements,
which is the alphabet, do not tell us about I
before E except after C, or how to build a sentence, or what the definitions are of the words, that is architecture on the left, it is implementation on the right. And this is one of the concepts that is yet to be understood well with enterprise architecture. Most organizations, once again, are building implementation composites. And that’s why we’re having troubles out there in architecture, ’cause we’re actually
not doing architecture, we’re doing implementation modeling, which is good, fantastic,
but it’s not architecture, it’s something a little bit different. So we’re gonna suggest, essentially, that a good framework has a
good definition of artifacts. It has six elements: what, how, where, who, when, and why. And if there is another interrogative, then somebody’s got a Nobel Prize. And on the right-hand side,
I give you the translation of what essentially
those what, how, where, who, when, and why elements are. These are what we refer to as the architectural artifact understandings. The second definition that
we need is, essentially, the stakeholders and transformations. And basically, there are people that need to describe the business, there are people that need
to define the relationships, there are people that need
to specify the components and the services, there are people that need to identify the technologies, and finally, the people that
need to select the solutions. And this is not a decomposition,
these are transformations. It’s a very different concept. And essentially, this is
the enterprise framework that we present to you
and to our client base and to people that are
educated in our process, and it is essentially an elaboration of the fine work of John Zachman. So this is essentially the baseline of what we talked about
before we get into big data. And what we’re gonna be talking about, of course, in big data, is this particular column, what we refer to as the materials column. And as you can see here, row three is where we introduce the concept of data. So what I’ve done for
you very briefly here is just to give you an overview of the concepts of enterprise architecture as we essentially move into
these concepts of big data. And we refer to all of this as holistic enterprise architecture. And it requires a frame of reference, every science that we’ve seen out there, prior to it becoming a profession, needs a frame of reference, a framework. It requires two representations. One is what we refer to
as architecture models, in the physical world, we refer to those as engineering models. The other is implementation models, which in the physical world we refer to as manufacturing models. Architecture corresponds to engineering,
implementation corresponds to manufacturing. Engineering models, manufacturing models in the physical world, in the
enterprise architecture world we refer to these as
architectural representations and implementation representations. It requires a methodology that is represented by a true framework. Methodologies work with a framework. There’s no such thing, that we can tell, as a method that’s neutral to a framework. We haven’t found one
in the physical world, and if someone comes up with one in the enterprise architecture
world, then once again, Nobel Prize time here. We just look at things very practically. We use the hmm test a
lot, we use the hmm test. What is the hmm test? If it doesn’t make sense
in the physical world, it’s probably not gonna make much sense in the enterprise world either. And that hmm test is one of the most valuable tests that we see here, especially, unfortunately nowadays, with all the things that
are out on the internet. It enumerates all the
architectural models. It essentially enumerates all
the implementation models. There is a finite set, a finite set of architectural representations and implementation representations. And it guides the practitioner
in the development of architecture models
and implementation models. That’s what a methodology does. And it results in initiatives to move the organization to its desired state, it does not stop at the modeling
activities that are there. Now, in summary, before we get into, essentially, now, this
concept of big data, here is what we believe is the state of the practice of
enterprise architecture. This may be a little bit
uncomfortable for people, but we sort of need to
know where we’re at here, because I think that we
need to position the science of big data, not sure the
word science is correct, big data analysis, in the context
of where it’s coming from. We’re gonna suggest to you,
number one, an agreed to definition of enterprise
architecture is lacking. There’s a lot of ’em out there. But we’re not gonna get anywhere until we essentially
understand what this is. Number two, an agreed to definition of a framework is lacking. I didn’t say which framework to pick, that is number three, but actually, what a framework is, a
framework is not a methodology. And the framework is
essentially the fundamentals for a profession, we have to have a framework. There is one in medicine,
there’s one in accounting. Every discipline has one. Electrical engineering, civil engineering, manufacturing, operational engineering, all these disciplines have a framework, and then, essentially,
the profession begins. So number three, an agreed to framework, once the definition of framework
is established, is lacking. We, if I can humbly suggest,
have gone past this, we believe that that is settled
as far as we’re concerned. And the reason we say
that is that we can take what other people suggest are frameworks and map it into what we use, and we can do that every time. The converse isn’t correct. If they can’t take the
framework that we use and map it into their
world, something is lacking. And again, I would use the hmm test at that particular point. The state of the practice, number four. Most modeling, comma,
what we refer to as EITA, enterprise information
technology architecture, is around systems and implementations. That’s what we suggest is going on today in the work that we see out there. And people call it by various things, application architecture,
information or data architecture, we suggest, by the way,
information is different than data, a topic for another day,
technology architecture, and we sometimes refer to this lovingly as the BAIT model, business architecture, applications architecture,
information architecture, technology architecture, we suggest are what we refer to as EITA and not EA, because we are putting
composite models together. I wanna stress again, we’re
not saying it’s wrong, we’re saying it’s different. And it actually should be the second step in our architectural
activities, and not the first. Number five, we now see business architecture popping its head up as the next evolution. Well, if you use the proper framework, you can position business architecture within the enterprise
framework that we showed you. So it’s not something new, it’s something that people are looking at because they don’t have a proper frame of reference just there. In the enterprise architecture
framework that we showed you, the work of John Zachman
is the beginning of that. And the framework that he has is, if I can use a phrase, one
and the same as what we have, what we’ve done is put a
little bit of elaboration on there for practitioning. Business architecture sits right there, it always has, as have many other disciplines. So once somebody tells us
what they’re developing, not through hocus pocus or words, but says, these are the
artifacts that we’re developing, we can map it and put a label on it, and sometimes we put a label
on it as business architecture. So a proper framework will enumerate the various architectures
in an enterprise, of which business architecture is one. Number six, and this is what leads us to essentially where we are right now, the modeling of data seems to be the most advanced in
both theory and practice. And we think that’s why
this concept of big data is getting some attention right now, for a number of reasons. One is there’s a lot of data out there, I think all of you would
suggest the same thing, there’s a tremendous
amount of it out there. And people want to do something with it. We do have to suggest that
most of the data out there, though, has not been architected. And that should give us just a little bit of pause here for a moment. It’s kind of scary in some areas. So that’s where we think we are, which leads us to the practice that a lot of people
are calling right now, big data, you know, that’s there. And as we transition to
this concept of big data, I’m just throwing up here
essentially a life cycle, plan, analyze, design,
construct, a methodology. This isn’t, of course, exactly
what you are possibly using, but most of our organizations go through some life
cycle to build systems. Plan, analyze, design, construct,
or something like that. You may have a few more phases, or you obviously have a lot more detail. But basically, there’s some concepts, essentially, from conceptualization
of what you want to do with your business people or
your subject matter experts, all the way through some
development of some solution that usually leads to some mechanization. And there’s a series of representations, graphical representations,
in most cases, hopefully, that people have put together. And I’m not suggesting
these are the correct ones, again, I’m trying to give you
a context of what is there. The thing we have to
recognize about big data is that not all data has
gone through this life cycle. And I’m chuckling because that’s what we’re seeing as one of the issues. What is it that you’re looking at when you see data out there? And how do you actually get the intent from the business from where we’re at? So what we’ve seen is some
interesting things out there. This is a very current,
as you can see here, little article that Tom Davenport wrote just a couple of days ago. And I just picked out, again, some big data information
out of this thing. And this is, again, for
you and I, the hmm test. As it says here, first thing
is about American Airlines. And by the way, this was his work that he did 10 years ago
at American Airlines. As it says here, at American Airlines more than a decade ago, they told me during a research visit
they had 11 different usages of the term airport. Now I’m a frequent flyer,
this scared me tremendously. As he said here, as a frequent
traveler on their planes, I was initially a bit
concerned about this. But when they explained it, the proliferation of meanings made sense. They said that the cargo
people at American Airlines view any place you can pick up or drop off cargo as an airport. The maintenance people viewed any place that you can fix an
airplane as an airport. The people who work with the International Air
Transportation Authority relied on a list of international
airports, and so on. This scared me, it made Tom comfortable, but it scared me. If we try to take these
disparate databases that have different definitions
of the word airport, what are we gonna get? Well, I wanna make sure that
the pilot flying the airplane knows which word is being
used in what context. And if we mush it all together,
how are we gonna do that? How are we gonna get the semantics? Now, please remember that Tom’s article is a recognition that, essentially, the meanings made sense. How do you have the unique meanings when you essentially
have a running database? Food for thought. The next week, I was doing
some consulting work, he says, at Union Pacific Railroad. They sheepishly admitted at some point, I’m sure he had to coax this out of them, that they had great debates
of what constitutes a train. For some it’s an abstract
scheduling entity, for others, it’s a locomotive, yet for others, it’s a locomotive and whatever rail cars it is pulling at the time. Frankly, not as big a
concern as the airports, but the same thing. The question is, when you
use data out of context because you don’t have
the traceable context, once again, the traceable context as we’ll be talking about in a little bit, gotta be a little bit careful. And I believe that enterprise architecture is going to address this, as we’ll see in just a few minutes. Financial Times, earlier in the year, Big Data, Are We Making a Big Mistake. By the way, this presentation
is not about killing big data. It’s about opening everybody’s eyes and saying, what do we have out here, and how do we make this work? How do we make this work? But the question is, how do you take existing
databases and use them? Not how do you do it tomorrow, and that’s what people
have been trying to do. So Tim Harford wrote an
article in Financial Times, incidentally, I’m finding
a tremendous amount of understanding about
enterprise activities in technology, as you’re seeing here, not from the IT literature as
much, but from other areas. And so more and more reading outside of IT is becoming important, because that’s when we
start seeing the effects of enterprise architecture, excuse me, effects of enterprise information
technology that’s there. So, Tim says, cheerleaders for big data have made four exciting claims, each one reflected in the
success of Google Flu Trends. If you remember a while back,
Google had put together, using all of the data
they have on you and I, trying to figure out what the, essentially, flu trend was going to be in various areas in the United States. From a standpoint of the
pharmaceutical industry, unfortunately, these
predictions cost them billions, billions of dollars in inaccurate
activities that are there. And there’s a number of different reasons that this occurred. Again, there were four exciting claims,
each one reflected in the success of Google Flu Trends, and these were the claims: that data analysis produces uncannily accurate results; that every single data
point can be captured, making old statistical
sampling techniques obsolete. That was the failure, by the way, the major failure in
the Google Flu Trends. It didn’t capture everything,
because the number of people that actually have internet
access is an issue. The number of people
that are using Google, who are not all the people on the internet, is also an issue. So you don’t have a 100% sample, it wasn’t even close. Next, that it is passé to fret
about what causes what, because statistical correlation tells us what we need to know. That’s another claim that has
been proven to be incorrect. And that scientific or
statistical models aren’t needed because, to quote “The End of Theory,” a provocative essay
published in “Wired” in 2008, with enough data, the
numbers speak for themselves. All of these have been shown in this article to be a problem. Let me go further, they
were actually false. And we’re relying on science, but the question is,
what’s the underlying data? We’re forgetting about, essentially, understanding the principles
of design of experiments and all these scientific principles that have been around for some time, in the quest, in the desire to essentially use this
in a different way. It’s very, very, very
important that we understand where we are, because
then we can address it. And I believe enterprise architecture, again, is gonna get us there. Unfortunately, these
four articles of faith are at best optimistic
oversimplifications. At worst, according to the person that studied this at Cambridge University, they can be complete
bollocks, absolute nonsense. There’s another word that
we can use for bollocks, but we can’t do that in
mixed company that’s there. Caution, caution, caution, caution, is what really we need to do. If we look at this page here, we have essentially
four different phrases, and the reason I need to talk to you about this for just a moment, is to get a little bit of understanding of what we’re trying to do. There is forward engineering, which essentially is plan,
analyze, design, construct. That is the precursor of Figure 3b which is reverse engineering, which is trying to take
running data structures and figure out what the
business intent was. That’s what we’re trying
to do with big data. But all we have here is
running code in general. Now we’re gonna talk
about how we can do this if we have these intermediate steps. If you have something forward engineered, then there’s a strong possibility that you can reverse engineer it. Another phrase that’s out
there is restructuring. And restructuring essentially
does, as you can see here on Figure 3c, what that
does is essentially manipulate things within its own context. So we restructure a
database, that’s in here. We restructure the relational tables to make a database, that’s over here. So it’s in the concept, we’re
not trying to interpret, we are just essentially restructuring, we’re in the context of where we are. And finally, the concept
of re-engineering, which is different than
reverse engineering. What we’re trying to do in re-engineering is redeploy that particular
data perhaps, or process, into a different platform
or different venue. So we’re going from mainframes
to servers to mobile, those types of things is what is going on in this particular area
as we see it, okay? So what we get into right now is this concept of what we refer to as reverse data engineering. And in order to get into
reverse data engineering, we have to essentially look at it from this particular standpoint. And reverse data engineering is actually what we suggest we’re
trying to do with big data. Get the understanding of the
data from a running database so that we can reuse it
in a different format. So what I’m trying to do here
in our presentation there is put a little method to the madness that possibly is out there. How do you actually do this? And what are the qualifiers
that are required as we essentially do this? So, reverse data engineering assumes that all data design intent has been carried forward
into construction. So we actually have taken
the business understanding, the semantics, not just the syntax, the semantics, and brought
it down into construction. And it’s simply somehow encrypted, when I mean encrypted, not confidential, but it’s somehow in the
structure as we see it. In practice, we find that the transition from analysis to design, and
from design to construction loses certain details. That’s what we have to essentially
understand that’s there. And as we move forward into this, we find that when our data
models go to databases, we lose some semantics,
we lose behavior rules, we lose some of that structure because if the only thing
we have is running data and the relationships between data, we’ve lost some of those
things that are there. And then when we translate
our database designs into relational structures,
we lose design intent, such as the reasons for
making design choices. What was the reason we did this? Why did we use an index? Data integrity. Implementation doesn’t show us that. So essentially, if we
attempt to reverse engineer from a system whose documentation is lost or seriously out of date
from years of maintenance, we will become aware of
business requirements and design intent that
can only be recovered by human insight and retained knowledge. This is what we need to understand. I’m not saying, don’t do big data, but recognize that a lot of the reasoning for the data being there has been lost. This is the most
significant translation loss is the loss of meaning,
not loss of structure. So let me say it more directly. Let me say it more directly. Which definition of airport
do we actually have? Now what we’re gonna do
is show you an example of what this is to get you thinking about why plan, analyze, design, construct will be able to address big data. So if we look at this particular diagram, it shows an example of a data model that is for an order for a film, it doesn’t really matter what this is. And if some of this
terminology and symbology is a bit off to some of you, I wanna just explain
it to you a little bit, because this model here
is really what provides us with an understanding of
what the information is, the data is that we have. And the question is,
how does that actually transform itself into a running database? This data model essentially
describes the customers as we can see here, it
describes the orders, it describes the geographic locations. And one of the things that
we can see here, essentially, is that each customer is located
in at least one location, and could have multiple
locations across multiple states. So each customer is located
at least one location, and possibly multiple locations. And this is, of course,
the symbol for one to many. I don’t wanna go through
all the data modeling techniques here for you
right now, but basically, this model provides us with lots and lots and lots of information that comes from the understanding, this
semantic understanding, this is what the business people told us, a customer can be located
in one or many locations. And that’s a very
important thing to be able to understand as we see it. And of course, a complete data model, I’m not gonna go through
the other symbologies here, will normally have a definition
for each named component, what is a customer, what is an airport, which definition of
airport do we have, okay? Just think about trying to
take all these things together, all you see here is AARP or a key, we don’t know exactly which one it is, ’cause that’s what we’re trying to do, is to bring all the stuff together, ’cause we were told that that’s the usage and that it is different. And essentially all the
behavior rules that are there. So what we have, essentially,
at this particular point, is an ability to actually make some sense out of what is going on. Let’s go a little bit
further right now, okay? Let’s take these requirements, again, plan, analyze, design, construct, and now move it down its life cycle. Now business requirements are
captured by the data model and normally translated into, essentially, a database design, which
sort of sounds pretty good. Once again, throwing a bit of
terminology at you right now. What we have here is
essentially a normalized relational design for
those data requirements. And, again, my apologies if this symbology is a bit confusing to some of you, we really need to show this to you so you see what to watch
out for that’s there. And essentially, what we have here is now the transformation of the
semantic understanding into a relational structure. So for example, IEDOrd01 is the order entity type. So we’ve transferred the
concept of order, O-R-D-E-R, into a relational structure, and here’s all of its
attributes that are there. And for those of you that
are comfortable with this, the key elements, the
foreign keys that we need, essentially, to move forward. And so what we’ve taken is
that semantically rich model and translated and transformed it into a relational structure that is there. So far, we have a little
less understanding. But I’m sure for some of you, it’s raising some eyebrows at this particular point when we see this. At this point, we’ve simply
taken the data model, as it says here, and
produced an uncompromised relational design equivalent, yet we’ve already lost some meaning. Now let’s look at what
happens if we were to reverse engineer from
these relational tables. As you can see here, all of a sudden, already, from that relational structure, we have some unfortunate question marks, because we’ve lost that in the table. We find that if we come back
to our customer activity, we have lost the mandatory in precisely one record under customer. Hmm. And as you can see here,
all these question marks, once again, without going
through all these details for us here right now,
through our webinar, are question marks that we’ve lost not because we’re bad designers, but because the modeling
of the requirements, the semantics, is naturally lost as we move forward in the data structures. We need something else. It exists possibly in the programs, but not in the data
structures, hint, hint, hint. Or it exists in the documentation
or in people’s heads, but it doesn’t exist in
the data model alone. Let’s go a little bit further. Let’s go into, essentially,
now moving forward into a year after this has been built. Tuning and maintenance. There’s three things that we see here. We’ve lost the business rules. Relational tables alone cannot
express certain constraints. If we were to reverse
engineer these tables, we would end up with semantic loss of certain types of business rules. It doesn’t exist in the data alone. It’s not an argument
against relational tables, we’re simply demonstrating
that we currently lose important details as we
proceed down the life cycle. This is something that
we have to recognize. In the olden days, we used
to be called data processing. Now we’re called IT, but
we still process data. Without that context,
we’re gonna have a problem. Our loss worsens as the components are subject to maintenance. Now I’m gonna show you some of this on the next model that’s here. After tuning and maintenance, we get something that looks like this. Wow. Now the wow should be for a
couple of different reasons. This is essentially a tuned and maintained data structure that’s there. Because of this loss that occurs as we proceeded down the life cycle, as you can see here, reverse engineering cannot completely reconstruct
the database design, excuse me, the data design intent, not the database design,
the data design intent, or the business data requirements. So the question for us is
at this particular point, how do we get from this understanding, which is really where we are
when it comes to big data, we have running data structures,
to this understanding? And what’s more important is what happens if we try to take the data and make inferences out of it? And so what I’m leaving you with is that big data is dependent upon architecture. Plan, analyze, design, construct. If we don’t have this, we essentially have a hmm right now. So what we’re suggesting
is that quality data must precede any big data activities. How do you get quality data? Through enterprise architecture, through, essentially,
an architected approach. Most systems and associated databases have not been forward engineered from what we can tell. We don’t know everybody, and we don’t know all of you online, and we haven’t had the privilege
of working with everybody. But most of the things haven’t
been forward engineered with full traceability from
the business understanding. Most organizations have
no metadata understanding assuring that data in one database, for example, the term customer, or the term airport, or the term train, has the exact same meaning and structure in another database. Consolidating data
using any new technology will not address these issues. It’s not about mushing things together, it’s about understanding things. Oh, by the way, mushing
is a technical term, as I think I mentioned. It’s not a technology issue. So we wanna be a little bit careful as we essentially move forward. So how do we actually do this? How do we actually address this? So what we need to do is to figure out, as you can see here on these blue arrows, is how do you get from a running structure to something different? And we’re gonna suggest to you, the way that we do that is
through enterprise architecture, basically, a top down
architected approach. And the top down approach
is not decomposition, but the transformations
from the business intent into the relationship understanding, to a technology neutral,
to a technology specific, to a solution specific,
with all that traceability and forward engineering,
then we can reverse engineer and we will have, essentially, a successful big data activity. Without this, all I can suggest to you is it’s a bit of a
crapshoot, we don’t know. I’ve given you some examples
of areas that you can see, I think, quite clearly. At American Airlines, for example, how difficult it would be. So in your own organization, when you’re attempting to do big data without a forward engineering, or the phrase is, essentially, enterprise architected approach, we need to be very careful
and look at not only the name, not only the name, but its context, and then see if we can make
heads or tails out of that. Some of you, I believe, will
have some success in this area, but I think we’ll have a
tremendous amount of success if we have an enterprise
architected approach. So you can see how with
an architected approach, we can go from the business intent to the business
relationship understanding, to the technology neutral representation, to the technology specific representation, to the solution specific representation, to the implementation. Once we have that, then on this
chart, we can go back again, and we will have a
successful big data activity. I hope that our brief time together has given you some
insight into this concept and these activities that
are so popular right now. And of course, in an hour session, we can’t cover every
conceivable nuance that’s there. We’ve only lightly touched on this. But hopefully this
presentation gives you enough to have your organization think about this as you go forward. Obviously, we would suggest
enterprise architecture needs to come prior to
any big data initiatives. I thank you for your participation, your listening today. It’s an interesting week for all of us as we move into the holiday period. I wanna leave all of you
with a Merry Christmas, a Happy Hanukkah, a Happy Kwanzaa, whatever holiday you’re celebrating, or if you’re just taking
a little bit of time off, hopefully you’ll enjoy yourself. I hope this added to your insight into these particular topics. We will be asking you if we
can, if we can indulge you, we’re gonna be sending
you a bit of a survey to ask you how you enjoyed
this, if you found it useful, and if there’s any other topics that you would be interested in. The session has been recorded, and if anyone is interested
in seeing that information, we’ll be more than happy
to get that to you. From our Enterprise Architecture
Center of Excellence and our Business Architecture
Center of Excellence, once again, thank you very,
very much for your time, and perhaps we’ll be able to
work with you in the future. My best to you, my best
to all, thank you again.
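
To make the airport example from the talk a little more concrete, here is a minimal Python sketch of the kind of metadata check that could come before consolidating databases. It is only an illustration: the `Term` class, the `CATALOG` entries, and both helper functions are hypothetical names, and the definitions paraphrase the three usages mentioned in the talk rather than any real catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str        # the name as it appears in a database
    context: str     # the business area that owns the definition
    definition: str  # what the business people say the term means


# Hypothetical metadata entries, paraphrasing the usages described in the talk.
CATALOG = [
    Term("airport", "cargo", "any place cargo can be picked up or dropped off"),
    Term("airport", "maintenance", "any place an airplane can be fixed"),
    Term("airport", "international", "an entry on a list of international airports"),
]


def conflicting_terms(catalog):
    """Return the names that carry more than one definition across contexts."""
    definitions = {}
    for term in catalog:
        definitions.setdefault(term.name, set()).add(term.definition)
    return sorted(name for name, defs in definitions.items() if len(defs) > 1)


def qualified_name(term):
    """Keep the context attached to the name so consolidated data stays traceable."""
    return f"{term.context}.{term.name}"


if __name__ == "__main__":
    print(conflicting_terms(CATALOG))
    print([qualified_name(t) for t in CATALOG])
```

The point of the sketch is the one made in the talk: the name alone is not enough, so the check looks at the name together with its context before anything is mushed together.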

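The loss of the "each customer is located in at least one location" rule can also be sketched in a few lines. This is an assumption-laden illustration using Python's built-in `sqlite3` module: the table and column names are hypothetical and are not the ones on the slides. A foreign key lets each location point at a customer, but nothing in the tables themselves requires a customer to have a location.

```python
import sqlite3

# Hypothetical tables standing in for the customer/location part of the data model.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE location (
        location_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        state       TEXT NOT NULL
    );
""")

# One customer with a location, one without. The tables accept both,
# even though the data model said every customer has at least one location.
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Acme')")
conn.execute("INSERT INTO customer (customer_id, name) VALUES (2, 'Globex')")
conn.execute("INSERT INTO location (customer_id, state) VALUES (1, 'PA')")
conn.commit()

# Customers that violate the business rule the tables could not express.
customers_without_location = conn.execute("""
    SELECT c.name
    FROM customer AS c
    LEFT JOIN location AS l ON l.customer_id = c.customer_id
    WHERE l.location_id IS NULL
""").fetchall()
```

Someone reverse engineering only these two tables would see a one-to-many link and would have no way to tell whether "at least one" was ever intended. That rule has to come from the programs, the documentation, or people's heads.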