Posted November 21st, 2008 by Joe Pendry
CA unveiled its cloud strategy this week. One need it cited for virtualization management was change management for dynamic virtualized environments. The CA Service Desk Manager’s change management function helps to govern approval of data center automation policies. Change managers can analyze how automated policies comply with authorized configurations through the CA CMDB’s automated application discovery function and its tracking of automated changes.
The downtime winner of the week is Twitter! Sporting a new downtime image, Twitter was unavailable for a significant amount of time this week. SMS delivery problems are speculated to be a cause, since the Twitter Status blog did not indicate the reason for the database efforts.
ITIL v3 recommends using Key Performance Indicators (KPIs) to measure the effectiveness and efficiency of the ITIL process. The problem is that ITIL is not a regulatory standard for measuring services. The business values of the KPI service are utility and warranty. CIO provides a seven-step improvement process to identify KPIs for a service. The steps include defining what you should and can measure; gathering, processing, and then analyzing the data; presenting the information; and finally implementing corrective action.
According to Michael Keen, Director & Senior Solutions Architect for the Enterprise Architecture Group at Alliance Technologies, enterprise architecture is just as important as IT governance. He says that enterprise architecture shows how all the components of the enterprise are related and it provides a framework for supporting and automating business processes, especially with change. Keen claims that clients should not ‘set and forget’ enterprise architecture because it is a dynamic, disciplined and ongoing process. If enterprise architecture is properly envisioned and implemented, it will enable IT orgs to respond to change rapidly.
Filed Under: Change Management, Downtime, IT Operations Research, ITIL
Posted November 20th, 2008 by Dennis Powell
Cloud computing, a blistering hot buzz-term in today’s market, has the potential to reshape the way that IT delivers service to the business organization. Or does it?
At IT’s About Uptime, we think there are just as many questions as answers when it comes to cloud computing. For example:
- What exactly is cloud computing, who does it best serve, and who is most impacted by this technology approach?
- How does one control such things as sensitive data in the cloud?
- How does the cloud support enterprise computing?
- Can you build your own cloud/s?
- What role does IT have in this data center that one doesn’t see or control?
- How will they manage performance, monitor users, satisfy SLAs, make and test changes… when relying on an outsourcer’s IT department to provide internet-based services?
Answers to the preceding questions depend on who is trying to shine light on the subject.
John Willis provides the IT Management and Cloud Blog, a collection of cloud related articles, links, podcasts, and… a common-sense introduction to cloud computing. I recommend that you check out John’s “What is a Cloud – Introduction” series of video/pods. John explains cloud computing in a casual manner that is easy to follow without trying to impress you with complex terms or detailed examples. Set aside some time as his initial video runs approximately 45 minutes, but it is helpful to those that want a practical cloud computing overview.
Take a peek at James Governor’s MonkChips blog for a slightly tongue in cheek “15 Ways” series about enterprise cloud computing. The trackbacks to James’ posts alone are worth the price of admission.
If you want insight into where cloud computing is headed, take a gander at CNET’s Business Tech article “The future of the cloud”.
But hey, we’re just scratching the surface. Simply Google “cloud computing blogs” for answers to all your cloud computing questions. You may not get the same answer twice, but you’ll have no lack of opinion.
(photo credit: Storm Crypt)
Filed Under: Cloud Computing
Posted November 14th, 2008 by Joe Pendry
Last week, the BBC experienced a DDoS attack that caused a significant amount of downtime for their website. The attack lasted five hours causing the downtime to spread over multiple short intervals, lasting just a few minutes each time. Royal Pingdom provides a diagram of the BBC website’s hourly average load time of the HTML page.
Mike Kavis, CTO of Kavis Technology Consulting, shares his presentation on ‘Managing Change in Business Transformation Environments.’ He discusses why transformational IT initiatives like BPM & SOA fail and offers advice on how to prevent failures. Mike also reveals his thoughts on the future of IT and how a well-run, well-planned enterprise architecture is a key aspect of a successful implementation of transformational change.
Demian Entrekin, founder of on-demand PPM provider Innotas, shared his 10 predictions for the Future of SaaS and On-Demand Software Applications. He says the key to driving adoption will be better product integration and alliance with other SaaS companies. The 10 trends include software without borders, more product alliances and shifting more energy to market strategy.
Peter Farquharson, technology integration service manager for the City of Saskatoon, discusses ITIL implementation and change management for his organization. He points out that although the city is not doing a lot to promote ITIL, he is taking measures with change management systems. Farquharson wants to see a “citywide management system with a Web-based front end that allows citizens to ask for a pot hole fix and have it happen as seamlessly as the city can now respond to technology change requests or incidents.”
Filed Under: Change Management, Downtime, IT Operations, ITIL
Posted November 14th, 2008 by Jonah Paransky
Today’s post is a wrap up of the Exchange Connections 2008 conference where StackSafe exhibited this week. I had a chance to speak with many interesting exhibitors and attend some great sessions at the Fall Exchange Connections 2008 show, but I thought three of the sessions covered content of particular relevance to the readers of IT’s About Uptime.
Migration to Exchange 2007: the Front-End by Robert Dawson at HP
Robert provided a detailed overview of the process involved in the front-end migration of Exchange 2003 to Exchange 2007. He covered a variety of topics including:
- Prerequisites of coexistence
- Installing Exchange 2007
- Making the switch
- Links/References
During the overview, Robert made a number of points that offer valuable lessons for a wider variety of upgrade projects.
- Plan – If you fail to plan, you plan to fail
- Gather information. You need to understand the environment before beginning the migration process. Robert particularly raised the issue that teams often overlook third-party applications that may rely upon the existing Exchange infrastructure.
- Get the right people together for the migration. Exchange environments are often large, encompassing Active Directory servers, messaging components, security teams, network teams, etc. It is imperative to bring everyone together to properly prepare for the migration.
Building an Exchange Test Environment in a Hurry by Michael B. Smith of The Essential Exchange
Michael covered the steps required to build a “quickie” test version of Microsoft Exchange in a virtual environment.
Michael made two points that relate to pulling applications into virtualized environments that seemed of particular interest:
- Beware of memory contention – some applications are memory hogs. Even in a hypervisor-based virtual environment, it is critical to properly size the server so that the performance of memory-intensive workloads is appropriately maintained.
- Beware of I/O contention – some applications have significant I/O requirements. Often the limiting factor in the performance of virtualized workloads is I/O, so keep a close eye on it.
I wish I’d known: Exchange 2007 Upgrade Lessons from the Field by Jim McBee of Ithicos Solutions and author of the Mostly Exchange Web Log
Jim covered lessons he has learned in the field performing multiple Exchange transitions. A number of these lessons were universal as well, and worthy of mention:
- In larger organizations, don’t underestimate the amount of time and resources that will be required to prepare the Active Directory (AD) infrastructure. With tasks ranging from removing (or moving) legacy Server 2000 controllers to updating the schema, Active Directory prep can be a significant undertaking. Jim specifically mentioned an example where the preparation required a six-week approval cycle for a single AD change.
- Make sure to find all the existing applications that may rely upon the Exchange environment prior to beginning the transition project. Jim used an example of a customer that had 1000 applications that relied upon the Exchange infrastructure, each of which needed to be analyzed and tested prior to the transition.
Where does testing play in all of this?
All the sessions provided great detailed information. One thing that felt missing (perhaps due to our own focus on the topic) was a deeper look at the actual testing activities required to ensure a successful transition. A number of unanswered questions popped into my mind during all the sessions…
- How do you test to make sure the required Active Directory changes will not negatively impact production prior to release?
- How do you test the impact of the migration on related applications, before the infrastructure is deployed into production?
- How do you make sure that the proposed server roles are properly designed, before the infrastructure is active in the environment?
- How do you ensure a smooth migration, without production problems or disruption of mission critical messaging services?
I would like to hear your thoughts on any or all of these questions.
Filed Under: IT Operations, Testing, Virtualization
Posted November 12th, 2008 by Joe Pendry
IT’s About Uptime would like to thank Tarry Singh from the Avastu Blog for pointing us towards much of the content in today’s blog. He recently gave a presentation on virtualization titled “Shi(f)t Happens” that provided some good data about virtualization and where he sees it heading.
As Tarry mentions, in an environment where cutting operational expenditures is top of mind, it isn’t surprising that virtualization continues to be a hot topic. VMware recently posted numbers that would seem to show that organizations are in fact spending more on virtualization and cloud technologies than what they call “pre-recessionary” spending levels.
“A poll conducted during this very well-attended webinar clearly indicated that spending continues in virtualization and cloud technologies at paces equal to and often far exceeding pre-recessionary levels. In fact, only 5.9% of respondents expected to see declining investment in these areas, which is quite a statement when you consider the downward pressure facing most IT budgets.”
Last month, Gartner released a list of top ten disruptive technologies that would help with cutting costs, and virtualization came in at number two.
We have discussed the benefits of virtualization on this blog in the past. Organizations can gain great economic benefits from using this technology, but they must also realize that it can introduce a degree of complexity to the environment that can be problematic.
“Complexity is a good thing, because this normally accompanies flexibility and other benefits to the business that IT is trying to support. The problem is this: complexity can also make it more difficult for IT operations teams to understand where problems originate when something goes wrong. This has a tendency to cause IT Operations teams to be very conservative about anybody touching their environments.”
As Tarry mentions in his presentation, managers are worried about the best way to implement virtualization in order to gain the benefits. If this isn’t done correctly, virtualization could cause more work than it is intended to save. Among the questions are:
- Which virtualization vendor?
- What server platform to use?
- To deploy in production or not to deploy in production?
- How do I align my people to manage virtualization correctly?
With budgets shrinking, the stakes of getting these questions answered correctly are even higher. Managers have both more to gain and more (control) to lose.
Filed Under: IT Operations, Virtualization
Posted November 7th, 2008 by Joe Pendry
The country elected Obama as the 44th President of the United States, and Royal Pingdom applauds his site for an impressive 100% uptime in the six months leading up to the election. Royal Pingdom says that achieving 100% uptime requires two things, “a good web hosting company and a skilled webmaster to maintain the site.” McCain managed a 99.96% uptime.
A new report from Gartner predicts that the “worldwide enterprise SaaS market will surpass $6.4 billion this year, up 27% from 2007.” Sharon Mertz, the research director at Gartner, recognized the popularity and growth of the on-demand deployment model. The factors for deployment are the current economic state, better broadband and a need to ‘rapidly deploy software to meet a specific business need.’
In a recent survey conducted by IDC and sponsored by HP, a majority of IT organizations at large companies worldwide see reduced IT costs as a potential outcome of implementing ITIL. The survey says more than one-quarter of organizations in the Americas are already implementing ITIL and ‘over 60% indicated they plan to bring the best practices in house within the next one to six months.’
John Michelsen, iTKO’s co-founder and “Chief Scientist,” shared his three-step process for successful SOA testing: development builds the testing framework, development builds initial test cases, and QA enhances the test cases. Based on his customer base, Michelsen said one of the characteristics of a successful SOA implementation is integrated testing in the development cycle.
Filed Under: Downtime, IT Operations, ITIL, Testing
Posted November 5th, 2008 by Joe Pendry
This is the latest update to our ranking of the top 10 IT Operations blogs. Our inaugural ranking can be found here. This list will serve as a friendly ranking of blogs that focus on issues important to IT Operations teams. From business project management to ITIL, we’re reviewing it all. Don’t see a blog that should be in the list? Please let us know!
Again, our ranking is based on Technorati Authority (the number of sites/blogs linking to each blog). For blogs without a Technorati Authority (those not registered), the number of blog reactions is divided by 3 for an estimate.
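As a rough illustration of that scoring rule, here is a minimal Python sketch. The function name `authority_score` and the sample blogs and numbers are made up for the example; only the "reactions divided by 3" fallback comes from the methodology described above.

```python
from typing import Optional


def authority_score(technorati_authority: Optional[int], blog_reactions: int = 0) -> float:
    """Score used for ranking: the registered authority if there is one,
    otherwise an estimate of blog reactions divided by 3."""
    if technorati_authority is not None:
        return float(technorati_authority)
    return blog_reactions / 3


# Hypothetical sample data: (name, Technorati Authority or None, blog reactions)
blogs = [
    ("Blog A", 120, 0),
    ("Blog B", None, 90),   # not registered, so estimated from reactions
    ("Blog C", 25, 0),
]

# Highest score first
ranked = sorted(blogs, key=lambda b: authority_score(b[1], b[2]), reverse=True)
for position, (name, authority, reactions) in enumerate(ranked, start=1):
    print(position, name, authority_score(authority, reactions))
```

Ties, like the one between two blogs in this month's list, would need an extra rule that this sketch does not attempt.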
We understand Technorati might not be the best measurement, but it is open, and here at IT’s About Uptime we show no bias towards blogs. We want to share with our readers the blogs that provide relevant news for IT Operations professionals. The top 10 list will be updated and posted on a monthly basis, noting changes in rankings, new blogs and up and coming blogs.
For those who are on the list, please feel free to place the following badge on your blog to share your ranking.
![clip_image001[1]](http://www.stacksafe.com/blog/wp-content/uploads/2008/11/clip-image0011-thumb.jpg)
Our second ranking follows. We had a lot of movement this week. Technorati recently changed their formula, so many blogs have seen a drop in their authority over the past couple of weeks. In fact, six of our ten blogs saw a drop in Technorati Authority.
Even so, ComputerWorld remains at the top. Other changes include Avastu (Tarry Singh) moving up from number 5 to number 4, which moved IT Skeptic down to number 5. BitCurrent and The Forrester Blog for IT Infrastructure and Operations Professionals are currently tied in 6th place with the same Technorati Authority and rank. Doug McClure moved down a step while the Hot Aisle secured the number 7 spot with an increase of three.
- ComputerWorld – The Voice of IT Management: 1,041 (no change)
- IT PRO: 348 (-51)
- Inside Architecture: 51 (no change)
- Avastu (Tarry Singh): 28 (-10)
- IT Skeptic: 27 (-17)
- BitCurrent: 21 (-14)
- The Forrester Blog for IT Infrastructure and Operations Professionals: 21 (-4)
- The Hot Aisle: 16 (+3)
- Doug McClure: 11 (-12)
- Virtual Lab Automation Blog: 3 (no change)
Filed Under: Top 10 IT Operations Blogs
Posted November 4th, 2008 by Joe Pendry
Kurt Milne, managing director of the independent research firm the IT Process Institute (ITPI), took a moment to answer some questions on best practices for application upgrades and more. Kurt has over 15 years of experience in various marketing management, alliance management, and engineering positions at leading technology companies such as BMC Software, Remedy Software and Hewlett-Packard. His main areas of expertise include IT service management and IT controls, inventory and supply chain management, and computer integrated manufacturing. At ITPI, Kurt is responsible for overall ITPI operations including sponsorship and membership. Kurt is also on the strategic advisory board for the IT Infrastructure Management (ITIM) Association.
More detail from Kurt’s responses and answers to similar questions are spotlighted in StackSafe’s Best Practice Guide: Application and Infrastructure Upgrade with focus on pre-production release and testing. You can download the entire guide for free.
StackSafe: How did you get the data for the recent report of application upgrades? How many companies did you research?
Kurt Milne: We used data and analysis from multiple IT Process Institute studies to develop our summary of best practices related to the application and infrastructure upgrades. Our study methodology is focused on identifying practices that have a measurable impact on a range of key IT operating performance measures. We conducted three different web-based surveys and collected data from over 1000 companies that was used to identify a range of practices that best predict top performance across the software development and production release lifecycle. Our study of Change, Configuration and Release practices was referenced in the footnotes of the best practice guide. Free summaries of all our studies are available on the whitepaper section of www.itpi.org.
StackSafe: Can you explain the AIM framework and its impact on organizations?
Kurt Milne: We put our recommended practices into a staged framework that made sense to the serial CIO on our study team. Most organizations want a simple way to scale into new practices to help build momentum for organizational change. Using a staged approach for implementing application and upgrade best practices across the development and release lifecycle helps identify where to start, and also build a vision for a top-performing future state. The Adjust level practices identify initial changes that should yield immediate measurable results. The Improve level practices build on early success and identify additional ways to increase performance. The Master level practices round out our recommendations and should help organizations achieve operational excellence and top levels of performance.
StackSafe: How do process enhancers like ITIL and COBIT fit into your recommendations for best practices?
Kurt Milne: We develop our data collection surveys based on what we hear during IT executive interviews. Many of the practices identified during those interviews come from the study participant’s use of frameworks such as ITIL and COBIT. We don’t set out to test those frameworks specifically. But our approach often identifies the handful of practices, out of literally hundreds of practices outlined in these frameworks, that are shown to have a broad impact on the sample of companies in our studies. That doesn’t mean that the other practices in these frameworks should not be considered or used by IT groups. But our approach identifies those practices that are commonly used by organizations with higher levels of performance.
StackSafe: How can IT Operations teams measure the success of their efforts in configuration management and application upgrades?
Kurt Milne: We recommend two types of measures.
- Process measures that verify practices and controls are implemented.
For each area of the best practice guide, we offer several measures that help indicate whether those practices are being followed. For example, in the configuration management section, we suggest measuring the number of systems that are verified to match golden build, and the time to rebuild a system from bare metal.
- Outcomes measures that indicate the operational performance of the organization.
For each area of the best practice guide, we show measures that are likely to improve as a result of implementing the practices, as determined from our empirical studies. For example, in the section related to linking change requests to business and infrastructure context, our studies have shown that the use of recommended practices strongly correlates with a lower release rollback rate and reduced configuration drift. The combination of process and outcomes measures helps IT organizations simultaneously focus on the practices and their impact.
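To make the two kinds of measures concrete, here is a small Python sketch of one process measure (share of systems verified against a golden build) and one outcome measure (release rollback rate). The record types, field names and sample data are assumptions made for illustration; they are not taken from the ITPI guide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class System:
    name: str
    matches_golden_build: bool  # hypothetical result of a configuration check


@dataclass
class Release:
    change_id: str
    rolled_back: bool


def golden_build_match_rate(systems: List[System]) -> float:
    """Process measure: fraction of systems verified to match the golden build."""
    if not systems:
        return 0.0
    return sum(s.matches_golden_build for s in systems) / len(systems)


def rollback_rate(releases: List[Release]) -> float:
    """Outcome measure: fraction of releases that had to be rolled back."""
    if not releases:
        return 0.0
    return sum(r.rolled_back for r in releases) / len(releases)


# Made-up sample data
systems = [System("web01", True), System("web02", True),
           System("db01", False), System("app01", True)]
releases = [Release(f"CHG-{i}", rolled_back=(i == 3)) for i in range(10)]

print(f"Golden build match rate: {golden_build_match_rate(systems):.0%}")
print(f"Release rollback rate: {rollback_rate(releases):.0%}")
```

Tracking both numbers over time is one way to see whether following the practice (the first measure) moves the result (the second).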
StackSafe: How does a company’s culture impact their ability to effectively change and manage their current processes?
Kurt Milne: Having a process culture is one of the strongest predictors of top performance we have uncovered in our studies. It makes sense logically that if you drop best practices into an IT shop that doesn’t have a history or culture of following documented procedures, then those practices will not be consistently followed. If best practices are not consistently followed, then the desired results are uncertain. My opinion is that the process culture is more important than the specific practices. In other words, consistently following a sub-optimal process is likely to give better results than inconsistently following a best practice. This is especially true in the control areas of access management, change management, and configuration management. Variability in these processes introduces unknown levels of operational and security risk.
StackSafe: What steps can companies take to ensure that infrastructure and application upgrades enhance business capabilities and minimize disruption?
Kurt Milne: The longer I study IT operations, the more I am convinced there is a basic set of operational practices that should be implemented in some form, by all IT organizations. This best practice guide offers a simple set of practices, in a staged format, that IT shops should consider for optimal performance.
Filed Under: Change Management, IT Operations, IT Operations Research, ITIL, Interviews, Interviews-Analysts, Testing
Posted October 31st, 2008 by Dennis Powell
Just my opinion here, but it’s probably best, before releasing a change to a production system, for IT to have some idea of how that change will impact production. I’m guessing that your IT Release Manager would agree. However, when our research shows that 25% of all changes put into production cause some type of negative impact, and 10% of all changes have to be rolled back because they can’t be fixed, it sort of indicates that something is missing from the change and release management process.
To be fair to the IT testing community, there are several significant challenges to adequately testing changes targeted for today’s multi-tiered software infrastructures. We’ve talked about these in the context of the IT Operations Perfect Storm, in which IT is pressed to test a high volume of different changes (everything from a patch to an application migration) against very complex software infrastructures that are interdependent with other systems (some not under IT’s direct control) in a business climate that expects nearly 100% uptime. So, some of what is missing is the commitment to testing fundamentals of budget, resources, expertise, and time.
Even for those organizations that make the preceding fundamental commitment, testing success can be limited by the time needed to build and maintain not just a representative staging and testing environment, but the tests, reports, and certification functions to facilitate testing. So part of the problem can be linked to a lack of comprehensive pre-deployment testing solutions.
Jasmine Noel of Ptak Noel and Associates, LLC participated with StackSafe in a webinar on Wednesday October 29th to discuss the other factor that is missing: a focus on effective process identified as change impact management. Change impact management (CIM) has commonly been defined as a method to evaluate change at a business service level to determine impact to the execution of a business process. Jasmine drove this definition a step further to the actual testing level.
CIM satisfies three primary testing objectives:
- Validate that the configuration of a proposed change maps to established policy
- Thoroughly test changes to clearly identify the impact on production
- Monitor the behavior of the environment after the change is deployed
An effective CIM solution will:
- Draw conclusions based on real data
- Help IT understand complex environments
- Help IT understand the sequence of events to satisfy a change requirement
- Help IT understand how the system behaves based on a change
- Be easy to use to help IT focus on the results rather than process
Our experience shows that there are five additional factors that will ensure CIM success:
- Establish a formal staging and testing environment
- Test every change that is planned for production
- Test changes across the end-to-end infrastructure
- Utilize automated change management solutions
- Adhere to best practice guidelines
When you combine the commitment to test with the right solution and the right process, you should (as Jasmine states) “get the benefit of [testing] hindsight without actually living through a full-scale disaster!”
Filed Under: Change Management, IT Operations Research, Testing
Posted October 31st, 2008 by Joe Pendry
We posted our thoughts on the London Stock Exchange downtime, and Pingdom provides some more interesting stats. Over the past decade, a variety of stock exchanges have experienced downtime. A reason for this could be the fact that “stock exchanges usually have strict SLAs and high demands on availability, but they can’t completely avoid downtime.” The outages listed include problems with software upgrades, power outages, and software bugs.
Martin over at Blade Watch (congratulations on two years!) asks whether or not IT and business are set to become more aligned. He says, “It’s not enough to do IT, we need the team to understand the infrastructure and the application to know what happens if server17 fails, what if feed4 breaks, how to fix or co-ordinate issues, how to manage and continually improve IT in terms of service and delivery.”
We had the opportunity to speak with the IT Skeptic and John Willis in the past. It seems we also helped to set up a fine connection between the two as both posted brief Q&A’s with one another. Check it out for more information on ITIL, ITSMF, and what makes operations management so exciting.
Does Google really have 99.9% uptime? According to TechCrunch, Google announced that the enterprise version of Gmail comes with a 99.9 percent uptime guarantee. Google is extending that guarantee for enterprise customers to Google Calendar, Google Docs, Google Sites, and Google Talk. Radicati Group conducted research to compare services like Microsoft Exchange and found that Google and Gmail were four times as reliable.
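To put a figure like 99.9% in perspective, the percentage can be converted into a downtime budget. The Python sketch below does that arithmetic; the function name and the 30-day period are assumptions chosen for the example, not the terms of Google's actual SLA.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float = 30) -> float:
    """Minutes of downtime an uptime guarantee permits over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


budget = allowed_downtime_minutes(99.9, period_days=30)
print(f"99.9% uptime over 30 days allows about {budget:.1f} minutes of downtime")
```

Over a 30-day month, a 99.9% guarantee leaves room for roughly 43 minutes of downtime, so a single bad afternoon can use up the whole allowance.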
Filed Under: Downtime, IT Operations, ITIL