Monday 3 December 2007

How to best manage your databases

Sunday Business Post - Computers in Business magazine - Dec 2 2007

The latest storage products excel in keeping the data much closer to the end user, writes Dermot Corrigan.


Most large organisations now generate staggering amounts of data. This includes reams and reams of externally-facing information on customers, suppliers and partners, as well as masses and masses of internally-facing information on products, processes and people.


This information has to be stored in a usable form so it can be exploited to grow the business, keep track of products or meet regulatory requirements.


This means that databases are continuing to grow in size at a phenomenal rate, according to Bill O'Brien, platform strategy manager with Microsoft Ireland.


“If you think back two years ago, a one-terabyte database was considered enormous, but now in the UK they would have four or five databases bigger than 16 terabytes, and Ireland would have lots of organisations with databases greater than one terabyte,” he said.


One terabyte (TB) contains 1,000,000,000,000 bytes, which is a lot of zeroes and a lot of information. O'Brien said large organisations were using advances in digital storage technology to electronically store information, such as images, sound and video, that would previously have been kept in other formats.


“Enterprises are creating and storing much more digital information than they did in the past,” said O'Brien. “They used to just store text and numbers, the traditional stuff such as customer information, transactions et cetera, but they are now digitising pictures, invoices, sound, geo-spatial map information. There is a huge amount of additional material being stored and database technology has had to deal with this.”


Christian Blumhoff, NetWeaver business consultant with SAP, said many legacy database systems stored information too far away from the end user, so that it was not available quickly enough to be used effectively.


“Over the last two years the database has become a bit of a bottleneck, in that it does not fulfil the requirements of a modern enterprise,” said Blumhoff. “It does not react fast enough to aggregative requests for information.”


Blumhoff said the latest storage products keep the data much closer to the end user, where it can be quickly accessed on demand to support business processes.


“A modern organisation wants to retain its information in a memory state that it can get at very quickly,” he said. “If you look at knowledge management, document imaging, enterprise search, all of these business-process-critical toolsets are not operating at a database level, although they might store data there for persistence's sake. All the real-time processing and execution is actually done in memory.”


O'Brien said the latest database products included business intelligence tools that provide information directly to different people within the organisation, enabling them to do their jobs better and smarter.

“It is about working smarter with the data that you have,” he said.

“At executive level there are dashboards, a single place where executives can view a number of different metrics from a number of different sources. There is drilling into the data beneath that, maybe to highlight an area of the business that is not performing, perhaps sales in a particular region. A few years ago that would have meant asking your finance department to go and do some research. Now managers expect to be able to instantly drill down into the data and analyse it live themselves.”
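
As a rough illustration of the kind of drill-down O'Brien describes, the sketch below uses Python's built-in sqlite3 module and an invented sales table: it first summarises revenue by region for the dashboard view, then breaks one weak region out by month. All table and column names here are made up for the example.

    import sqlite3

    # In-memory database with a hypothetical sales table for the example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, month TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("North", "2007-10", 120.0), ("North", "2007-11", 95.0),
         ("South", "2007-10", 80.0),  ("South", "2007-11", 40.0)],
    )

    # Dashboard level: one headline figure per region.
    for region, total in conn.execute(
            "SELECT region, SUM(revenue) FROM sales GROUP BY region"):
        print(region, total)

    # Drill-down: the manager spots that South is weak and breaks it out by month.
    for month, total in conn.execute(
            "SELECT month, SUM(revenue) FROM sales WHERE region = ? GROUP BY month",
            ("South",)):
        print(month, total)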

O'Brien said reports were provided in the way that best allows decision makers to act cleverly and quickly.

“That is not necessarily just screens with numbers, it can be visualisation in meaningful ways as well,” said O'Brien. “It can be simple things like traffic lights with green, amber and red to show the state of the business at any time, and whether objectives are being met.”
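
The traffic-light idea reduces to a simple threshold rule. A minimal sketch in Python, with the 90 and 75 per cent cut-offs chosen arbitrarily for illustration:

    def traffic_light(actual, target):
        """Map a metric against its target to a red/amber/green status."""
        ratio = actual / target
        if ratio >= 0.9:
            return "green"   # objective on track
        if ratio >= 0.75:
            return "amber"   # needs attention
        return "red"         # objective being missed

    print(traffic_light(82, 100))   # amber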

Quality

It is all well and good having lots of data stored, and being able to get quick access to it, but businesses must also ensure the information they are keeping is correct, according to Alys Woodward, business intelligence and analytics program manager with global technology research firm IDC.

“Data quality can be a massive problem for organisations,” said Woodward.

“It is also one of those problems people do not know they have until they have investigated it. We say to people that if you do not know you have a data quality problem, then you do have one.”

Woodward said organisations should look carefully at the information they have stored to determine its data quality.

“You start off with an investigation process and there are a number of tools that can help you do that,” she said. “They actually go into a database and tell you what tables you have, what values you have in the tables, and how those tables match together.”
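
A very rough sketch of what such a profiling pass does, written against Python's built-in sqlite3 module. Real profiling tools are far more sophisticated, and the table here is invented purely for the example: for each table the script reports the row count and, per column, the number of distinct values and nulls.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, postcode TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [("Smith", "D04"), ("Smith", None), ("Byrne", "D04")])

    # List the tables the database actually contains.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]

    for table in tables:
        columns = [col[1] for col in conn.execute(f"PRAGMA table_info({table})")]
        total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        print(table, total, "rows")
        for col in columns:
            distinct = conn.execute(
                f"SELECT COUNT(DISTINCT {col}) FROM {table}").fetchone()[0]
            nulls = conn.execute(
                f"SELECT COUNT(*) FROM {table} WHERE {col} IS NULL").fetchone()[0]
            print(" ", col, "distinct:", distinct, "nulls:", nulls)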

Organisations from different sectors will have different data quality priorities, according to Woodward.

“Then you would look at which parts of your data are the highest priority for you to fix, maybe making sure your customer data, names and addresses, is correct if you are a marketing company. Or if you are a manufacturing company you may decide that sorting your product information out is more important,” she said.

“You would also set levels of data quality that are acceptable, as getting 100 per cent data quality is much more difficult than getting 97 per cent.”

Woodward said once organisations have cleaned up their data, they could then put in place mechanisms to ensure a high level of data quality going forward.

“For example if you have call centre operators inputting data about customers, you might put rules in place to make sure they do things correctly,” she said.

“Or if you have a lot of web forms where customers are filling in their own information, you cannot necessarily trust them to spell everything correctly. The classic example is you do not just ask the customer for their post code and type it in, you have address-matching capability in there, so you make sure the post code is valid for the road.”
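
A hedged sketch of the kind of rule Woodward describes: reject a web-form submission where the postcode does not match the road. The reference table of road-to-postcode pairs and the codes themselves are invented for illustration; real address matching runs against licensed postal reference data.

    # Hypothetical reference data mapping road names to their valid postcodes.
    VALID_POSTCODES = {
        "Main Street": {"A94 X2R4", "A94 Y7F1"},
        "High Road": {"D04 K3P2"},
    }

    def postcode_matches_road(road: str, postcode: str) -> bool:
        """Reject form input where the postcode is not valid for the given road."""
        return postcode.strip().upper() in VALID_POSTCODES.get(road.strip(), set())

    print(postcode_matches_road("Main Street", "a94 x2r4"))  # True
    print(postcode_matches_road("High Road", "A94 X2R4"))    # False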

Costs

Ian Devine, director of data services and business intelligence solutions with IBM, said that as databases get bigger and bigger, organisations were looking for efficient storage technologies to keep their costs manageable.

“Storage is growing and growing, it is an increasing component of operational managers' budgets and they are expecting us to provide a far lower cost of ownership,” he said.

Devine said compression is important if you want to efficiently store your data.

“We compress data to table level,” he said. “So you create a data dictionary which provides you with the compression. We are seeing compression ratios of better than 70 per cent for regular applications, whether transactional or warehouse. That provides a real saving for people whose storage is growing 10 or 20 per cent per annum.”
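
The engine-level row compression Devine describes is a feature of the database product itself, but the underlying data-dictionary idea can be sketched in a few lines: repeated values are stored once in a dictionary and each row keeps only a short code. The data and the resulting ratio below are illustrative only.

    # Dictionary encoding of a repetitive column: store each distinct value once
    # and keep only a small integer code per row.
    rows = ["Dublin", "Dublin", "Cork", "Dublin", "Galway", "Cork"] * 1000

    dictionary = {value: code for code, value in enumerate(sorted(set(rows)))}
    encoded = [dictionary[value] for value in rows]

    raw_size = sum(len(value) for value in rows)  # bytes of raw text
    # Assume each code fits in one byte, plus the dictionary itself.
    encoded_size = len(encoded) + sum(len(v) for v in dictionary)
    print(f"compression ratio: {1 - encoded_size / raw_size:.0%}")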

Human management costs can also add to the cost of storing huge amounts of information, according to Devine.

“We have put a lot of effort into autonomics, where you are able to run databases with little or no intervention from the main cost of running things, which is humans,” he said.

Open source

One way to keep database costs low is to choose an open-source solution, which can be downloaded free of charge.

“Open source is generally slimmer and does not have the richness of what the big guys do,” said Woodward. “But it still meets quite a lot of people's needs. If you are just looking at storing data and doing some queries then the open source vendors might be perfectly good.

“The big guys say they [open source databases] are very simple and cannot do this or that, but the end users may have very simple requirements.”

MySQL is probably the best-known open source database, and now has 11 million installations globally, according to Joe Morrissey, managing director for Britain and Ireland of MySQL AB.

“At MySQL we keep it simple,” he said. “We have a 15-minute rule - you should be able to download and install it in that time. Reliability and ease of use are priorities for our software.”

Morrissey said open source database products were often used in different ways from traditional enterprise databases. For example, internet and telecommunications companies formed a large chunk of MySQL's customer base.

“Google does not use any traditional enterprise databases for its AdWords business,” he said.

“Also we are increasingly the choice for software companies that wish to embed a database and provide a batteries-included solution for their customers.”

Morrissey said open source databases allowed organisations to store data that it would have been too expensive to retain in the past.

“80 per cent of the world's data is not in a database and it should be,” said Morrissey. “It has been either too costly or too complex to store all data in a database. Our mission is to make databases available and affordable to all.”

Features

Increasing regulatory requirements mean many organisations have a legal obligation to keep accurate records. Devine said the latest database products can help companies do this.

“Being an American company we are painfully aware of things like Sarbanes-Oxley and Basel and how important regulation is,” he said. “All the regulatory stuff is worked into the product.”

The ability to access information from the database at all times is also a priority for many organisations.

“Databases now run on phones, PDAs and embedded devices,” said O'Brien. “They are not just things that are running in the server room.”
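
SQLite is the best-known example of this kind of embedded database: the whole engine runs inside the application process and keeps everything in a single local file, with no server at all. A minimal sketch using Python's bundled sqlite3 module; the file and table names are invented for the example.

    import sqlite3

    # The database is just a local file; no server process, no network connection.
    conn = sqlite3.connect("device_data.db")
    conn.execute("CREATE TABLE IF NOT EXISTS readings (taken_at TEXT, value REAL)")
    conn.execute("INSERT INTO readings VALUES (?, ?)", ("2007-12-03T09:00", 4.2))
    conn.commit()

    for row in conn.execute("SELECT * FROM readings"):
        print(row)
    conn.close()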

Many enterprise-level organisations now leave parts of the database open so that partners, such as customers, suppliers or consultants, can have access to the information they need to best interact with the business.

“The whole idea of self-service processes, enterprise services, web services and application services that are opened up to self-service style processes is critical,” said Blumhoff.

Privacy is also a big concern for organisations that have stored sensitive data.

“A lot of people want to encrypt their information, and they do not want that to have any impact on their applications,” said O'Brien. “They want the database to go off and do that itself.”
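
The transparent encryption O'Brien describes happens inside the database engine itself. As a rough application-side illustration of the same idea only (encrypt before write, decrypt on read), here is a sketch using the third-party cryptography package; the value and the key handling are invented for the example, and a real deployment would keep the key in a key-management system.

    from cryptography.fernet import Fernet  # third-party: pip install cryptography

    key = Fernet.generate_key()   # in practice the key lives in a key-management system
    cipher = Fernet(key)

    # Encrypt a sensitive value before it is written to the database column...
    stored = cipher.encrypt(b"card-number-4111-1111")

    # ...and decrypt it again when the application reads it back.
    print(cipher.decrypt(stored).decode())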

O'Brien said one way for the larger database vendors to add innovation to their products was to buy up smaller players who introduce interesting tools or services into the market.

“There are a lot of acquisitions in this space,” he said. “Microsoft purchased ProClarity, Oracle purchased Hyperion and Siebel, and SAP purchased Business Objects. They are all business intelligence providers at one level or another. We can all compete about the merits of our database technology, but the coming battleground is around how people access and use that data.”

Devine said IBM was also active in acquiring new ideas to improve its database products.

“We are always looking at companies that can accelerate our time to market,” he said. “If we think there is something there that can fill a gap or solve that problem for us then we do that.”

Modification

Many large organisations develop their storage systems to best suit their individual requirements, according to Blumhoff.

“Most applications will have an extended database space, where people can add custom code,” he said.

“So most databases will look significantly different from one organisation to another.”

O'Brien said, however, that the core database product remained standard, with variations from organisation to organisation in how the solution was best implemented.

“They typically buy the product as it is and all of the features are in there,” he said. “People do not really customise the core capability, most of the work with data is in what information people get access to, how it is presented and who gets access to what data.”
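
One common way of controlling who gets access to what data, without touching the core product, is to expose views rather than base tables. A small sketch with Python's sqlite3 module and an invented staff table; in a server database the view would also carry its own access permissions.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staff (name TEXT, department TEXT, salary REAL)")
    conn.execute("INSERT INTO staff VALUES ('A. Byrne', 'Sales', 52000)")

    # The view exposes only the columns most users should see; salary stays hidden.
    conn.execute("CREATE VIEW staff_directory AS SELECT name, department FROM staff")

    print(conn.execute("SELECT * FROM staff_directory").fetchall())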

Woodward said many enterprises were running a number of different databases in different areas of their organisation, which was not the most efficient practice.

“It just does not happen that companies know exactly what they want to do and go in on a greenfield site,” she said. “You get one department that puts in a system, and then another department does something else. It would have made sense for them to discuss it first and put the same thing in, but it is just not how it tends to work.

“Part of IT's job is then to say we now have five Oracle databases, why do we not consolidate them and make things more efficient and streamlined,” she added.

Organisations can buy specific tools from smaller vendors, which they can then use to improve their access to or use of the information stored in their databases, according to Woodward.

“Even reporting tools from small companies will integrate with all the major databases,” she said. “As a software vendor you need to be able to say your reporting tool can be used everywhere.”

Woodward said the amounts of data being collected by enterprises would continue to grow and grow.

“For example once RFID (radio frequency identification) takes a foothold, very high volumes of data will come in,” she said. “Take something like dairy goods that have to be chilled at all times, from storage to the shop floor. They are looking at including temperature sensors that check at intervals to make sure the temperature of the goods remains the same.”

“If you have volumes of data like that it is an order of magnitude greater than what we now have. The more data you have the greater the challenge is to get something out of it, and present it in a way that a human being can use it to make a decision.”
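
A toy illustration of turning that flood of readings into something a person can act on: raw temperature samples (invented here) are reduced to one worst-case figure per pallet, which is all the decision maker really needs to see. The 5°C threshold is arbitrary.

    # Hypothetical raw RFID temperature readings: (pallet, temperature in °C).
    readings = [
        ("pallet-1", 3.8), ("pallet-1", 4.1), ("pallet-1", 4.0),
        ("pallet-2", 4.2), ("pallet-2", 7.9), ("pallet-2", 4.5),
    ]

    LIMIT = 5.0  # illustrative chill threshold

    worst = {}
    for pallet, temp in readings:
        worst[pallet] = max(worst.get(pallet, temp), temp)

    for pallet, temp in sorted(worst.items()):
        status = "OK" if temp <= LIMIT else "breach"
        print(pallet, temp, status)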

At present the majority of information in databases is structured, i.e. it fits within clearly defined parameters or fields. In the future the data stored might not be so simply organised, said Woodward.

“One interesting innovation will be the introduction of unstructured data, which is not held in rows and columns,” she said. “Increasingly customers are looking at the way you would search using Google and looking to bring that into the database. That is where some interesting stuff will happen, but whether it will happen over the next 12 months I am not sure.”
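
Google-style keyword search over text inside the database already exists in some engines as full-text indexing. A short sketch using SQLite's FTS5 extension through Python's sqlite3 module, assuming (as is usual) that the bundled SQLite library was compiled with FTS5; the documents are invented for the example.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    # A full-text index over free-form document text rather than rows and columns.
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    conn.execute("INSERT INTO docs VALUES (?, ?)",
                 ("Q3 invoice note", "Customer disputed the delivery charge"))
    conn.execute("INSERT INTO docs VALUES (?, ?)",
                 ("HR policy", "Annual leave must be booked in advance"))

    # Keyword query, ranked by relevance, much like a web search box.
    for title, in conn.execute(
            "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank", ("invoice",)):
        print(title)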

Blumhoff said the model of large storage units in dusty basements within the enterprise would be slowly phased out.

“We are moving towards service-based relationships, where I do not need to retain the information locally all the time, I have a trusted partner that actually manages the infrastructure for me,” said Blumhoff.

“From a database perspective that means actually moving into a shared-service environment and offering repository and memory based services for the retrieval of information.”

In the longer term, Blumhoff said, the database as we know it would fade away.

“What has made a step forward is the virtualisation of data into memory,” said Blumhoff. “There will continue to be an emotional need for a database, so we can go and find the data on the disc in a format we can understand, but from a technology perspective, things like virtual computing, grid computing and service-based architecture will require different things from databases, and I think that is where we will see the transformations. Overall we will see the database slowly but surely disappear.”
